Novel DNA cloning method

Description 

The invention refers to a novel method for cloning DNA molecules using a 
homologous recombination mechanism between at least two DNA 
molecules. Further, novel reagent kits suitable for DNA cloning are provided. 

Current methods for cloning foreign DNA in bacterial cells usually comprise 
the steps of providing a suitable bacterial vector, cleaving said vector with 
a restriction enzyme and in vitro-inserting a foreign DNA fragment in said 
vector. The resulting recombinant vectors are then used to transform 
bacteria. Although such cloning methods have been used successfully for
about 20 years, they suffer from several drawbacks. In particular, the in
vitro steps required for inserting foreign DNA in a vector are often very
complicated and time-consuming if no suitable restriction sites are
available on the foreign DNA or the vector.

Furthermore, current methods usually rely on the presence of suitable 
restriction enzyme cleavage sites in the vector into which the foreign DNA 
fragment is placed. This imposes two limitations on the final cloning 
product. First, the foreign DNA fragment can usually only be inserted into 
the vector at the position of such a restriction site or sites. Thus, the 
cloning product is limited by the disposition of suitable restriction sites, and
cloning into regions of the vector where there is no suitable restriction site
is difficult and often imprecise. Second, since restriction sites are typically
4 to 8 base pairs in length, they occur multiple times as the size of the
DNA molecules being used increases. This represents a practical
limitation to the size of the DNA molecules that can be manipulated by most
current cloning techniques. In particular, the larger sizes of DNA cloned into
vectors such as cosmids, BACs, PACs and P1s are such that it is usually
impractical to manipulate them directly by restriction enzyme based
techniques. Therefore, there is a need for providing a new cloning method
from which the drawbacks of the prior art have at least partly been
eliminated.
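
The size limitation can be made concrete with a rough estimate. The following sketch is an illustration only: it assumes that the four bases are equally frequent and independent, which real DNA does not strictly obey, and the function name and the molecule sizes are hypothetical choices made for this example.

```python
def expected_site_count(dna_length_bp: int, site_length_bp: int) -> float:
    """Expected occurrences of one fixed recognition sequence on one strand.

    Assumption (illustration only): bases are equally frequent and
    independent, so a given site matches any window of the DNA with
    probability 1 / 4**site_length_bp.
    """
    if site_length_bp <= 0:
        raise ValueError("site length must be positive")
    # Number of positions at which a site of this length could start.
    windows = max(dna_length_bp - site_length_bp + 1, 0)
    return windows / 4 ** site_length_bp


# Hypothetical molecule sizes, chosen only to show the trend.
for label, size in [("5 kb plasmid", 5_000),
                    ("40 kb cosmid insert", 40_000),
                    ("150 kb BAC insert", 150_000)]:
    counts = {n: round(expected_site_count(size, n), 1) for n in (4, 6, 8)}
    print(label, counts)
```

Under these assumptions the expected number of sites grows linearly with the length of the molecule, which is why a site that is unique in a small plasmid is unlikely to remain unique in a cosmid-sized or BAC-sized clone.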

According to the present invention it was found that an efficient
homologous recombination mechanism between two DNA molecules occurs
at usable frequencies in a bacterial host cell which is capable of expressing
the products of the recE and recT genes or functionally related genes such
as the redα and redβ genes, or the phage P22 recombination system
(Kolodner et al., Mol.Microbiol. 11 (1994), 23-30; Fenton, A.C. and Poteete,
A.R., Virology 134 (1984), 148-160; Poteete, A.R. and Fenton, A.C.,
Virology 134 (1984), 161-167). This novel method of cloning DNA
fragments is termed "ET cloning".

The identification and characterization of the E.coli RecE and RecT proteins
is described by Gillen et al. (J.Bacteriol. 145 (1981), 521-532) and Hall et al.
(J.Bacteriol. 175 (1993), 277-287). Hall and Kolodner (Proc.Natl.Acad.Sci.
USA 91 (1994), 3205-3209) disclose in vitro homologous pairing and
strand exchange of linear double-stranded DNA and homologous circular
single-stranded DNA promoted by the RecT protein. These documents contain
no reference to the use of this method for the cloning of DNA molecules in
cells.

The recET pathway of genetic recombination in E.coli is known (Hall and
Kolodner (1994), supra; Gillen et al. (1981), supra). This pathway requires
the expression of two genes, recE and recT. The DNA sequence of these
genes has been published (Hall et al., supra). The RecE protein is similar to
bacteriophage proteins, such as λ exo or λ Redα (Gillen et al.,
J.Mol.Biol. 113 (1977), 27-41; Little, J.Biol.Chem. 242 (1967), 679-686;
Radding and Carter, J.Biol.Chem. 246 (1971), 2513-2518; Joseph and
Kolodner, J.Biol.Chem. 258 (1983), 10418-10424). The RecT protein is
similar to bacteriophage proteins, such as λ β-protein or λ Redβ (Hall et al.
(1993), supra; Muniyappa and Radding, J.Biol.Chem. 261 (1986), 7472-7478;
Kmiec and Hollomon, J.Biol.Chem. 256 (1981), 12636-12639). The
content of the above-cited documents is incorporated herein by reference.

Oliner et al. (Nucl. Acids Res. 21 (1993), 5192-5197) describe in vivo
cloning of PCR products in E.coli by intermolecular homologous
recombination between a linear PCR product and a linearized plasmid
vector. Other previous attempts to develop new cloning methods based on
homologous recombination in prokaryotes also relied either on the use of
restriction enzymes to linearise the vector (Bubeck et al., Nucleic Acids Res.
21 (1993), 3601-3602; Oliner et al., Nucleic Acids Res. 21 (1993), 5192-5197;
Degryse, Gene 170 (1996), 45-50) or on the host-specific recA-dependent
recombination system (Hamilton et al., J.Bacteriol. 171 (1989),
4617-4622; Yang et al., Nature Biotech. 15 (1997), 859-865; Dabert and
Smith, Genetics 145 (1997), 877-889). These methods are of very limited
applicability and are hardly used in practice.

The novel method of cloning DNA according to the present invention does 
not require in vitro treatments with restriction enzymes or DNA ligases and 
is therefore fundamentally distinct from the standard methodologies of DNA 
cloning. The method relies on a pathway of homologous recombination in 
E.coli involving the recE and recT gene products, or the redα and redβ gene
products, or functionally equivalent gene products. The method covalently
combines one preferably linear and preferably extrachromosomal DNA
fragment, the DNA fragment to be cloned, with a second, preferably
circular, DNA vector molecule, either an episome or the endogenous host
chromosome or chromosomes. It is therefore distinct from previous
descriptions of cloning in E.coli by homologous recombination, which rely
either on the use of two linear DNA fragments or on different recombination
pathways.

The present invention provides a flexible way to use homologous
recombination to engineer large DNA molecules including an intact >76 kb
plasmid and the E.coli chromosome. Thus, there is practically no limitation 
of target choice either according to size or site. Therefore, any recipient 
DNA in a host cell, from high copy plasmid to the genome, is amenable to 
precise alteration. In addition to engineering large DNA molecules, the 
invention outlines new, restriction enzyme-independent approaches to DNA 
design. For example, deletions between any two chosen base pairs in a 
target episome can be made by choice of oligonucleotide homology arms. 
Similarly, chosen DNA sequences can be inserted at a chosen base pair to 
create, for example, altered protein reading frames. Concerted combinations 
of insertions and deletions, as well as point mutations, are also possible. 
The application of these strategies is particularly relevant to complex or 
difficult DNA constructions, for example, those intended for homologous 
recombinations in eukaryotic cells, e.g. mouse embryonic stem cells. 
Further, the present invention provides a simple way to position site specific 
recombination target sites exactly where desired. This will simplify 
applications of site specific recombination in other living systems, such as 
plants and mice. 
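
The arm-directed deletions and insertions described above can be pictured with a small in silico sketch. It is an illustration under simplifying assumptions that are not part of the method itself: sequences are plain strings read on one strand, the target is treated as linear, each homology arm matches the target exactly and occurs exactly once, and the function name `predict_product` is hypothetical.

```python
def predict_product(target: str, left_arm: str, right_arm: str,
                    payload: str = "") -> str:
    """Predict the sequence after a double crossover at two homology arms.

    Everything in the target between the two arms is replaced by
    `payload`; an empty payload therefore models a clean deletion.
    """
    for arm in (left_arm, right_arm):
        if target.count(arm) != 1:
            raise ValueError("each homology arm must occur exactly once in the target")
    left_end = target.index(left_arm) + len(left_arm)
    right_start = target.index(right_arm)
    if right_start < left_end:
        raise ValueError("the left arm must lie upstream of the right arm")
    return target[:left_end] + payload + target[right_start:]


# Toy sequences chosen for readability, not taken from any real vector.
target = "AAACCCGGGTTT"
print(predict_product(target, "AAAC", "GTTT"))         # deletion between the arms
print(predict_product(target, "AAAC", "GTTT", "ATG"))  # insertion replacing the same span
```

Because the end points are set only by where the two arms end and begin, moving an arm by one base pair moves the junction by one base pair, which is the sense in which the target choice is independent of restriction sites.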

A subject matter of the present invention is a method for cloning DNA
molecules in cells comprising the steps: 

a) providing a host cell capable of performing homologous 
recombination, 

b) contacting in said host cell a first DNA molecule which is capable 
of being replicated in said host cell with a second DNA molecule 
comprising at least two regions of sequence homology to regions on 
the first DNA molecule, under conditions which favour homologous 
recombination between said first and second DNA molecules and 

c) selecting a host cell in which homologous recombination between 
said first and second DNA molecules has occurred.

In the method of the present invention the homologous recombination
preferably occurs via the recET mechanism, i.e. the homologous
recombination is mediated by the gene products of the recE and the recT
genes which are preferably selected from the E.coli genes recE and recT or
functionally related genes such as the phage λ redα and redβ genes.

The host cell suitable for the method of the present invention preferably is 
a bacterial cell, e.g. a gram-negative bacterial cell. More preferably, the host 
cell is an enterobacterial cell, such as Salmonella, Klebsiella or Escherichia.
Most preferably the host cell is an Escherichia coli cell. It should be noted, 
however, that the cloning method of the present invention is also suitable 
for eukaryotic cells, such as fungi, plant or animal cells. 

Preferably, the host cell used for homologous recombination and
propagation of the cloned DNA can be any cell, e.g. a bacterial strain, in
which the products of the recE and recT, or redα and redβ, genes are
expressed. The host cell may comprise the recE and recT genes located on
the host cell chromosome or on non-chromosomal DNA, preferably on a
vector, e.g. a plasmid. In a preferred case, the RecE and RecT, or Redα and
Redβ, gene products are expressed from two different regulatable
promoters, such as the arabinose-inducible BAD promoter or the lac
promoter, or from non-regulatable promoters. Alternatively, the recE and
recT, or redα and redβ, genes are expressed on a polycistronic mRNA from
a single regulatable or non-regulatable promoter. Preferably the expression
is controlled by regulatable promoters.

Especially preferred is also an embodiment wherein the recE or redα gene
is expressed by a regulatable promoter. Thus, the recombinogenic potential
of the system is only elicited when required and, at other times, possible
undesired recombination reactions are limited. The recT or redβ gene, on
the other hand, is preferably overexpressed with respect to recE or redα.
This may be accomplished by using a strong constitutive promoter, e.g. the
EM7 promoter, and/or by using a higher copy number of recT, or redβ,
versus recE, or redα, genes.

For the purpose of the present invention any recE and recT genes are 
suitable insofar as they allow a homologous recombination of first and 
second DNA molecules with sufficient efficiency to give rise to 
recombination products in more than 1 in 10^ cells transfected with DNA. 
The recE and recT genes may be derived from any bacterial strain or from 
bacteriophages or may be mutants and variants thereof. Preferred are recE 
and recT genes which are derived from E.coli or from E.coli bacteriophages, 
such as the redα and redβ genes from lambdoid phages, e.g. bacteriophage
λ.

More preferably, the recE or redα gene is selected from a nucleic acid
molecule comprising

(a) the nucleic acid sequence from position 1320 (ATG) to 2159 (GAC) as
depicted in Fig.7B,

(b) the nucleic acid sequence from position 1320 (ATG) to 1998 (CGA) as
depicted in Fig.14B,

(c) a nucleic acid encoding the same polypeptide within the degeneracy of
the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent conditions with
the nucleic acid sequence from (a), (b) and/or (c).

More preferably, the recT or redβ gene is selected from a nucleic acid
molecule comprising

(a) the nucleic acid sequence from position 2155 (ATG) to 2961 (GAA) as
depicted in Fig.7B,

(b) the nucleic acid sequence from position 2086 (ATG) to 2868 (GCA) as
depicted in Fig.14B,

(c) a nucleic acid encoding the same polypeptide within the degeneracy of
the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent conditions with
the nucleic acid sequences from (a), (b) and/or (c).

It should be noted that the present invention also encompasses mutants and 
variants of the given sequences, e.g. naturally occurring mutants and 
variants or mutants and variants obtained by genetic engineering. Further,
it should be noted that the recE gene depicted in Fig.7B is an already
truncated gene encoding amino acids 588-866 of the native protein.
Mutants and variants preferably have a nucleotide sequence identity of at
least 60%, preferably of at least 70% and more preferably of at least 80%
to the recE and recT sequences depicted in Fig.7B and 13B, and to the redα
and redβ sequences depicted in Fig.14B.
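
The identity thresholds can be illustrated with a short calculation. The text does not state how sequence identity is to be computed, so the sketch below rests on assumptions of its own: the two sequences are already aligned to equal length, a gap is written as '-', gaps never count as matches, and the denominator is the alignment length. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions carrying the same nucleotide.

    Both inputs must already be aligned to equal length; '-' marks a gap
    and a gap never counts as a match (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(identity: float) -> str:
    """Map an identity value onto the 60/70/80% thresholds named in the text."""
    for threshold in (80, 70, 60):
        if identity >= threshold:
            return f"at least {threshold}%"
    return "below 60%"


# Toy alignment: three of four columns agree.
value = percent_identity("ACGT", "ACGA")
print(value, identity_tier(value))
```

Other conventions, for example excluding gapped columns from the denominator, would give different values for the same pair of sequences, so the convention should be fixed before variants are compared against the thresholds.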

According to the present invention hybridization under stringent conditions 
preferably is defined according to Sambrook et al. (1989), infra, and 
comprises a detectable hybridization signal after washing for 30 min in
0.1 x SSC, 0.5% SDS at 55°C, preferably at 62°C and more preferably at
68°C.

In a preferred case the recE and recT genes are derived from the 
corresponding endogenous genes present in the E.coli K12 strain and its 
derivatives or from bacteriophages. In particular, strains that carry the sbcA 
mutation are suitable. Examples of such strains are JC8679 and JC 9604 
(Gillen et al. (1981), supra). Alternatively, the corresponding genes may 
also be obtained from other coliphages such as lambdoid phages or phage 
P22. 

The genotype of JC 8679 and JC 9604 is Sex (Hfr, F+, F- or F'): F-. JC
8679 comprises the mutations: recBC 21, recC 22, sbcA 23, thr-1, ara-14,
leuB6, DE(gpt-proA)62, lacY1, tsx-33, gluV44 (AS), galK2 (Oc), LAM-,
his-60, relA1, rpsL31 (strR), xylA5, mtl-1, argE3 (Oc) and thi-1. JC 9604
comprises the same mutations and further the mutation recA 56.

Further, it should be noted that the recE and recT, or redα and redβ, genes
can be isolated from a first donor source, e.g. a donor bacterial cell, and
transformed into a second receptor source, e.g. a receptor bacterial or
eukaryotic cell, in which they are expressed by recombinant DNA means.

In one embodiment of the invention, the host cell used is a bacterial strain 
having an sbcA mutation, e.g. one of E.coli strains JC 8679 and JC 9604 
mentioned above. However, the method of the invention is not limited to 
host cells having an sbcA mutation or analogous cells. Surprisingly, it has
been found that the cloning method of the invention also works in cells
without sbcA mutation, whether recBC+ or recBC-, e.g. also in prokaryotic
recBC+ host cells, e.g. in E.coli recBC+ cells. In that case preferably those
host cells are used in which the product of a recBC type exonuclease
inhibitor gene is expressed. Preferably, the exonuclease inhibitor is capable
of inhibiting the host recBC system or an equivalent thereof. A suitable
example of such an exonuclease inhibitor gene is the λ redγ gene (Murphy,
J.Bacteriol. 173 (1991), 5808-5821) and functional equivalents thereof,
which, for example, can be obtained from other coliphages
such as phage P22 (Murphy, J.Biol.Chem. 269 (1994), 22507-22516).

More preferably, the exonuclease inhibitor gene is selected from a nucleic 
acid molecule comprising 

(a) the nucleic acid sequence from position 3588 (ATG) to 4002 (GTA) as 
depicted in Fig.14A, 

(b) a nucleic acid encoding the same polypeptide within the degeneracy of 
the genetic code and/or 

(c) a nucleic acid sequence which hybridizes under stringent conditions (as 
defined above) with the nucleic acid sequence from (a) and/or (b).



Surprisingly, it has been found that the expression of an exonuclease 
inhibitor gene in both recBC+ and recBC- strains leads to significant 
improvement of cloning efficiency. 



WO 99/29837 PCT/EP98/07945

The cloning method according to the present invention employs a 
homologous recombination between a first DNA molecule and a second 
DNA molecule. The first DNA molecule can be any DNA molecule that 
carries an origin of replication which is operative in the host cell, e.g. an 
E.coli replication origin. Further, the first DNA molecule is present in a form 
which is capable of being replicated in the host cell. The first DNA 
molecule, i.e. the vector, can be any extrachromosomal DNA molecule 
containing an origin of replication which is operative in said host cell, e.g. 
a plasmid including single, low, medium or high copy plasmids or other 
extrachromosomal circular DNA molecules based on cosmid, P1, BAC or
PAC vector technology. Examples of such vectors are described, for
example, by Sambrook et al. (Molecular Cloning, Laboratory Manual, 2nd
Edition (1989), Cold Spring Harbor Laboratory Press) and Ioannou et al.
(Nature Genet. 6 (1994), 84-89) or references cited therein. The first DNA
molecule can also be a host cell chromosome, particularly the E.coli 
chromosome. Preferably, the first DNA molecule is a double-stranded DNA 
molecule. 

The second DNA molecule is preferably a linear DNA molecule and 
comprises at least two regions of sequence homology, preferably of 
sequence identity to regions on the first DNA molecule. These homology or 
identity regions are preferably at least 15 nucleotides each, more preferably
at least 20 nucleotides and, most preferably, at least 30 nucleotides each.
Especially good results were obtained when using sequence homology
regions having a length of about 40 or more nucleotides, e.g. 60 or more
nucleotides. The two sequence homology regions can be located on the
linear DNA fragment so that one is at one end and the other is at the other
end; however, they may also be located internally. Preferably, the
second DNA molecule is also a double-stranded DNA molecule.
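
A design of this kind can be checked before any oligonucleotide is ordered. The sketch below is a simplified illustration, not a validated tool: it handles only the case in which the two homology regions sit at the two ends of the linear fragment, it requires exact identity rather than mere homology, it reads one strand only, and the function name and its default values are hypothetical.

```python
def has_terminal_homology(first: str, second: str,
                          arm_len: int = 40, minimum: int = 15) -> bool:
    """Check that both ends of `second` are found verbatim in `first`.

    `arm_len` is the length of each terminal homology region to test;
    `minimum` mirrors the lowest preferred arm length given in the text.
    """
    if arm_len < minimum:
        raise ValueError("homology arm shorter than the chosen minimum")
    if len(second) < 2 * arm_len:
        return False  # fragment too short to carry two separate arms
    left, right = second[:arm_len], second[-arm_len:]
    return left in first and right in first


# Toy example with 4-nucleotide arms so that it fits on one line;
# real arms would be far longer.
first = "GGGAAACCCTTTGGG"
print(has_terminal_homology(first, "AAAC" + "GATC" + "TTTG", arm_len=4, minimum=4))
print(has_terminal_homology(first, "AAAT" + "GATC" + "TTTG", arm_len=4, minimum=4))
```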

The two sequence homology regions are chosen according to the 
experimental design. There are no limitations on which regions of the first
DNA molecule can be chosen for the two sequence homology regions
located on the second DNA molecule, except that the homologous
recombination event cannot delete the origin of replication of the first DNA
molecule. The sequence homology regions can be interrupted by
non-identical sequence regions as long as sufficient sequence homology is
retained for the homologous recombination reaction. By using sequence
homology arms having non-identical sequence regions compared to the
target site, mutations such as substitutions, e.g. point mutations, insertions
and/or deletions may be introduced into the target site by ET cloning.

The second foreign DNA molecule which is to be cloned in the bacterial cell 
may be derived from any source. For example, the second DNA molecule 
may be synthesized by a nucleic acid amplification reaction such as a PCR
where both of the DNA oligonucleotides used to prime the amplification
contain, in addition to sequences at the 3'-ends that serve as a primer for
the amplification, one or the other of the two homology regions. Using
oligonucleotides of this design, the DNA product of the amplification can be 
any DNA sequence suitable for amplification and will additionally have a 
sequence homology region at each end. 
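
The oligonucleotide design just described can be sketched as follows. This is a minimal illustration under stated assumptions: all sequences are given 5' to 3' on the top strand, only the letters A, C, G and T occur, the priming length of 20 nucleotides is an arbitrary default, and the names `design_oligos` and `reverse_complement` are hypothetical helpers rather than part of any existing library.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(cassette: str, left_arm: str, right_arm: str,
                  priming_len: int = 20):
    """Build the two amplification oligonucleotides and the expected product.

    Each oligonucleotide carries a homology region at its 5' end and a
    priming sequence at its 3' end, so the amplified cassette is flanked
    by the two homology regions.
    """
    if priming_len > len(cassette):
        raise ValueError("priming length exceeds cassette length")
    forward = left_arm + cassette[:priming_len]
    # The reverse oligonucleotide anneals to the top strand, so it is the
    # reverse complement of (cassette 3' end + right homology region).
    reverse = reverse_complement(cassette[-priming_len:] + right_arm)
    product = left_arm + cassette + right_arm
    return forward, reverse, product


# Toy sequences, far shorter than real homology regions or primers.
fwd, rev, product = design_oligos("ATGCCGTA", "AAAA", "CCCC", priming_len=3)
print(fwd, rev, product)
```

Placing the homology region at the 5' end leaves the 3' end free to anneal to the template, which is why the homology sequence does not need to be present in the template DNA at all.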

A specific example of the generation of the second DNA molecule is the 
amplification of a gene that serves to convey a phenotypic difference to the 
bacterial host cells, in particular, antibiotic resistance. A simple variation of 
this procedure involves the use of oligonucleotides that include other 
sequences in addition to the PCR primer sequence and the sequence
homology region. A further simple variation is the use of more than two
amplification primers to generate the amplification product. A further simple
variation is the use of more than one amplification reaction to generate the
amplification product. A further variation is the use of DNA fragments
obtained by methods other than PCR, for example, by endonuclease or
restriction enzyme cleavage to linearize fragments from any source of DNA.

species of DNA molecule. It is of course possible to use a heterogeneous
population of second DNA molecules, e.g. to generate a DNA library, such 
as a genomic or cDNA library. 

The method of the present invention may comprise the contacting of the 
first and second DNA molecules in vivo. In one embodiment of the present
invention the second DNA fragment is transformed into a bacterial strain 
that already harbors the first vector DNA molecule. In a different 
embodiment, the second DNA molecule and the first DNA molecule are 
mixed together in vitro before co-transformation in the bacterial host cell. 
These two embodiments of the present invention are schematically depicted 
in Fig. 1. The method of transformation can be any method known in the art
(e.g. Sambrook et al., supra). The preferred method of transformation or
co-transformation, however, is electroporation.

After contacting the first and second DNA molecules under conditions 
which favour homologous recombination between first and second DNA 
molecules via the ET cloning mechanism, a host cell is selected in which
homologous recombination between said first and second DNA molecules
has occurred. This selection procedure can be carried out by several
different methods. Three preferred selection methods are
depicted in Fig. 2 and described in detail below.

In a first selection method a second DNA fragment is employed which 
carries a gene for a marker placed between the two regions of sequence 
homology wherein homologous recombination is detectable by expression 
of the marker gene. The marker gene may be a gene for a phenotypic 
marker which is not expressed in the host or from the first DNA molecule. 
Upon recombination by ET cloning, the change in phenotype of the host 
strain conveyed by the stable acquisition of the second DNA fragment 
identifies the ET cloning product.

In a preferred case, the phenotypic marker is a gene that conveys resistance
to an antibiotic, in particular, genes that convey resistance to kanamycin,
ampicillin, chloramphenicol, tetracycline or any other substance that shows
bactericidal or bacteriostatic effects on the bacterial strain employed.

A simple variation is the use of a gene that complements a deficiency
present within the bacterial host strain employed. For example, the host 
strain may be mutated so that it is incapable of growth without a metabolic 
supplement. In the absence of this supplement, a gene on the second DNA 
fragment can complement the mutational defect thus permitting growth. 
Only those cells which contain the episome carrying the intended DNA 
rearrangement caused by the ET cloning step will grow. 

In another example, the host strain carries a phenotypic marker gene which 
is mutated so that one of its codons is a stop codon that truncates the open 
reading frame. Expression of the full length protein from this phenotypic 
marker gene requires the introduction of a suppressor tRNA gene which, 
once expressed, recognizes the stop codon and permits translation of the 
full open reading frame. The suppressor tRNA gene is introduced by the ET
cloning step and successful recombinants are identified by selection for, or
identification of, the expression of the phenotypic marker gene. In these
cases, only those cells which contain the intended DNA rearrangement 
caused by the ET cloning step will grow. 

A further simple variation is the use of a reporter gene that conveys a 
readily detectable change in colony colour or morphology. In a preferred 
case, the green fluorescent protein (GFP) can be used and colonies
carrying the ET cloning product are identified by the fluorescence emissions of
GFP. In another preferred case, the lacZ gene can be used and colonies
carrying the ET cloning product are identified by a blue colony colour when
X-gal is added to the culture medium.




In a second selection method the insertion of the second DNA fragment into
the first DNA molecule by ET cloning alters the expression of a marker 
present on the first DNA molecule. In this embodiment the first DNA 
molecule contains at least one marker gene between the two regions of 
sequence homology and homologous recombination may be detected by an 
altered expression, e.g. lack of expression of the marker gene. 

In a preferred application, the marker present on the first DNA molecule is
a counter-selectable gene, such as the sacB, ccdB or
tetracycline-resistance genes. In these cases, bacterial cells that carry the
first DNA molecule unmodified by the ET cloning step after transformation with the
second DNA fragment, or co-transformation with the second DNA fragment
and the first DNA molecule, are plated onto a medium on which the expression of
the counter-selectable marker conveys a toxic or bacteriostatic effect on the
host. Only those bacterial cells which contain the first DNA molecule
carrying the intended DNA rearrangement caused by the ET cloning step 
will grow. 

In another preferred application, the first DNA molecule carries a reporter 
gene that conveys a readily detectable change in colony colour or 
morphology. In a preferred case, the green fluorescent protein (GFP) can
be present on the first DNA molecule and colonies carrying the first DNA
molecule with or without the ET cloning product can be distinguished by
differences in the fluorescence emissions of GFP. In another preferred case,
the lacZ gene can be present on the first DNA molecule and colonies
carrying the first DNA molecule with or without the ET cloning product
can be identified by a blue or white colony colour when X-gal is added to the
culture medium.

In a third selection method the integration of the second DNA fragment into 
the first DNA molecule by ET cloning removes a target site for a site 
specific recombinase, termed here an RT (for recombinase target), present
on the first DNA molecule between the two regions of sequence homology.
A homologous recombination event may be detected by removal of the
target site.

In the absence of the ET cloning product, the RT is available for use by the 
corresponding site specific recombinase. The presence or absence of
this RT is the basis for selection of the ET cloning
product. In the presence of this RT and the corresponding site specific
recombinase, the site specific recombinase mediates recombination at this 
RT and changes the phenotype of the host so that it is either not able to 
grow or presents a readily observable phenotype. In the absence of this RT, 
the corresponding site specific recombinase is not able to mediate 
recombination. 

In a preferred case, the first DNA molecule to which the second DNA 
fragment is directed, contains two RTs, one of which is adjacent to, but not 
part of, an antibiotic resistance gene. The second DNA fragment is directed, 
by design, to remove this RT. Upon exposure to the corresponding site 
specific recombinase, those first DNA molecules that do not carry the ET 
cloning product will be subject to a site specific recombination reaction
between the RTs that removes the antibiotic resistance gene, and therefore
the first DNA molecule fails to convey resistance to the corresponding
antibiotic. Only those first DNA molecules that contain the ET cloning 
product, or have failed to be site specifically recombined for some other 
reason, will convey resistance to the antibiotic. 
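
The logic of this two-RT arrangement can be written out as a small truth table. The model below is a deliberately simplified sketch: each first DNA molecule is reduced to two yes/no properties, the class and function names are hypothetical, and efficiencies, copy number and timing are ignored.

```python
from dataclasses import dataclass


@dataclass
class FirstMolecule:
    # True if ET cloning removed the RT next to the resistance gene.
    et_product: bool
    # False models a molecule that escaped the recombinase for any reason.
    recombinase_acted: bool = True


def conveys_resistance(molecule: FirstMolecule) -> bool:
    """Return True if the molecule still carries the resistance gene."""
    both_rts_present = not molecule.et_product
    # Excision needs both RTs and an active site specific recombinase.
    gene_excised = both_rts_present and molecule.recombinase_acted
    return not gene_excised


for mol in (FirstMolecule(et_product=True),
            FirstMolecule(et_product=False),
            FirstMolecule(et_product=False, recombinase_acted=False)):
    print(mol, "->", conveys_resistance(mol))
```

The third case corresponds to the molecules that have failed to be site specifically recombined for some other reason, which is why colonies surviving this selection would still need to be verified.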

In another preferred case, the RT to be removed by ET cloning of the 
second DNA fragment is adjacent to a gene that complements a deficiency 
present within the host strain employed. In another preferred case, the RT 
to be removed by ET cloning of the second DNA fragment is adjacent to a 
reporter gene that conveys a readily detectable change in colony colour or 
morphology.

In another preferred case, the RT to be removed by ET cloning of the
second DNA fragment is anywhere on a first episomal DNA molecule and
the episome carries an origin of replication incompatible with survival of the
bacterial host cell if it is integrated into the host genome. In this case the
host genome carries a second RT, which may or may not be a mutated RT,
so that the corresponding site specific recombinase can integrate the
episome, via its RT, into the RT sited in the host genome. Other preferred
RTs include RTs for site specific recombinases of the resolvase/transposase 
class. RTs include those described from existing examples of site specific 
recombination as well as natural or mutated variations thereof. 

The preferred site specific recombinases include Cre, FLP, Kw or any site 
specific recombinase of the integrase class. Other preferred site specific 
recombinases include site specific recombinases of the 
resolvase/transposase class. 

There are no limitations on the method of expression of the site specific 
recombinase in the host cell. In a preferred method, the expression of the
site specific recombinase is regulated so that expression can be induced and 
quenched according to the optimisation of the ET cloning efficiency. In this 
case, the site specific recombinase gene can be either integrated into the 
host genome or carried on an episome. In another preferred case, the site 
specific recombinase is expressed from an episome that carries a 
conditional origin of replication so that it can be eliminated from the host 
cell. 

In another preferred case, at least two of the above three selection methods 
are combined. A particularly preferred case involves a two-step use of the
first selection method above, followed by use of the second selection
method. This combined use requires, most simply, that the DNA fragment
to be cloned includes a gene, or genes, that permit the identification, in the
first step, of correct ET cloning products by the acquisition of a phenotypic

required for homologous recombination. The homologous recombination
event, however, may also occur in vivo, e.g. by introducing RecE and RecT,
or Redα and Redβ, proteins or the extract in a host cell (which may be
recET positive or not, or redαβ positive or not) and contacting the DNA
molecules in the host cell. When the recombination occurs in vitro the 
selection of DNA molecules may be accomplished by transforming the 
recombination mixture in a suitable host cell and selecting for positive 
clones as described above. When the recombination occurs in vivo the 
selection methods as described above may directly be applied. 

A further subject matter of the invention is the use of cells, preferably
bacterial cells, most preferably E.coli cells, capable of expressing the recE
and recT, or redα and redβ, genes as a host cell for a cloning method
involving homologous recombination.

Still a further subject matter of the invention is a vector system capable of
expressing recE and recT, or redα and redβ, genes in a host cell and its use
for a cloning method involving homologous recombination. Preferably, the
vector system is also capable of expressing an exonuclease inhibitor gene
as defined above, e.g. the λ redγ gene. The vector system may comprise at
least one vector. The recE and recT, or redα and redβ, genes are preferably
located on a single vector and more preferably under control of a
regulatable promoter which may be the same for both genes or a separate
promoter for each gene. Especially preferred is a vector system which is
capable of overexpressing the recT, or redβ, gene versus the recE, or redα,
gene.

Still a further subject matter of the invention is the use of a source of RecE 
and RecT, or Redα and Redβ, proteins for a cloning method involving 
homologous recombination. 

A still further subject matter of the invention is a reagent kit for cloning 
comprising 

(a) a host cell, preferably a bacterial host cell, 

(b) means of expressing recE and recT, or redα and redβ, genes in 
said host cell, e.g. comprising a vector system, and 

(c) a recipient cloning vehicle, e.g. a vector, capable of being 
replicated in said cell. 

On the one hand, the recipient cloning vehicle which corresponds to the 
first DNA molecule of the process of the invention can already be present 
in the bacterial cell. On the other hand, it can be present separated from the 
bacterial cell. 

In a further embodiment the reagent kit comprises 

(a) a source for RecE and RecT, or Redα and Redβ, proteins and 

(b) a recipient cloning vehicle capable of being propagated in a host cell and 

(c) optionally a host cell suitable for propagating said recipient cloning 
vehicle. 

The reagent kit furthermore contains, preferably, means for expressing a 
site specific recombinase in said host cell, in particular, when the recipient 
ET cloning product contains at least one site specific recombinase target 
site. Moreover, the reagent kit can also contain DNA molecules suitable for 
use as a source of linear DNA fragments used for ET cloning, preferably by 
serving as templates for PCR generation of the linear fragment, or as 
specifically designed DNA vectors from which the linear DNA fragment is 
released by restriction enzyme cleavage, or as prepared linear fragments 
included in the kit for use as positive controls or other tasks. Moreover, the 
reagent kit can also contain nucleic acid amplification primers comprising 
a region of homology to said vector. Preferably, this region of homology is 
located at the 5'-end of the nucleic acid amplification primer. 

The invention is further illustrated by the following Sequence listings, 
Figures and Examples. 

SEQ ID NO. 1: shows the nucleic acid sequence of the plasmid 
pBAD24-recET (Fig. 7). 

SEQ ID NOs 2/3: show the nucleic acid and amino acid sequences of the 
truncated recE gene (t-recE) present on pBAD24-recET at positions 
1320-2162. 

SEQ ID NOs 4/5: show the nucleic acid and amino acid sequences of the 
recT gene present on pBAD24-recET at positions 2155-2972. 

SEQ ID NOs 6/7: show the nucleic acid and amino acid sequences of the 
araC gene present on the complementary strand to the one shown of 
pBAD24-recET at positions 974-996. 

SEQ ID NOs 8/9: show the nucleic acid and amino acid sequences of the 
bla gene present on pBAD24-recET at positions 3493-4353. 

SEQ ID NO 10: shows the nucleic acid sequence of the plasmid 
pBAD-ETγ (Fig. 13). 

SEQ ID NO 11: shows the nucleic acid sequence of the plasmid 
pBAD-αβγ (Fig. 14) as well as the coding regions for the genes redα 
(1320-200), redβ (2086-2871) and redγ (3403-3819). 

SEQ ID NOs 12-14: show the amino acid sequences of the Redα, Redβ 
and Redγ proteins, respectively. The redγ sequence is present on each 
of pBAD-ETγ (Fig. 13) and pBAD-αβγ (Fig. 14). 

Figure 1 



A preferred method for ET cloning is shown by diagram. The linear DNA 
fragment to be cloned is synthesized by PCR using oligonucleotide primers 
that contain a left homology arm chosen to match sequences in the 
recipient episome and a sequence for priming in the PCR reaction, and a 
right homology arm chosen to match another sequence in the recipient 
episome and a sequence for priming in the PCR reaction. The product of the 
PCR reaction, here a selectable marker gene (sm1), is consequently flanked 
by the left and right homology arms and can be mixed together in vitro with 
the episome before co-transformation, or transformed into a host cell 
harboring the target episome. The host cell contains the products of the 
recE and recT genes. ET cloning products are identified by the combination 
of two selectable markers, sm1 and sm2, on the recipient episome. 
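
The scheme of Fig. 1 can be illustrated in silico. The following minimal Python sketch predicts the sequence of an ET cloning product from a target sequence, two homology arms and the sequence carried between them. The function name `et_product`, the toy sequences, and the simplification of writing the circular episome as a linear string in which the left arm precedes the right arm are assumptions made for illustration only; they are not part of the method as described.

```python
def et_product(target: str, left_arm: str, right_arm: str, insert: str) -> str:
    """Predict the recombinant obtained when `insert`, flanked by the two
    homology arms, recombines with `target`.

    Simplification (assumption): the circular episome is written as a linear
    string in which the left arm lies upstream of the right arm.
    """
    if target.count(left_arm) != 1:
        raise ValueError("left homology arm must occur exactly once in the target")
    left_end = target.index(left_arm) + len(left_arm)
    right_start = target.find(right_arm, left_end)
    if right_start == -1:
        raise ValueError("right homology arm not found downstream of the left arm")
    # Whatever lies between the arms in the target is replaced by the insert.
    return target[:left_end] + insert + target[right_start:]


# Hypothetical toy sequences (real homology arms are tens of nucleotides long).
LEFT = "ACGTACGGTCA"
RIGHT = "GGATCCGATAG"
MARKER = "AAGGAA"  # stands in for a selectable marker gene such as sm1

adjacent = "TTTT" + LEFT + RIGHT + "TTTT"            # arms adjacent: insertion
spaced = "TTTT" + LEFT + "CCCCCC" + RIGHT + "TTTT"   # arms apart: replacement

insertion_product = et_product(adjacent, LEFT, RIGHT, MARKER)
replacement_product = et_product(spaced, LEFT, RIGHT, MARKER)
```

In this toy case both targets give the same product, because the only difference between them is the stretch between the arms, which the marker replaces.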

Figure 2 

Three ways to identify ET cloning products are depicted. The first, (on the 
left of the figure), shows the acquisition, by ET cloning, of a gene that 
conveys a phenotypic difference to the host, here a selectable marker gene 
(sm). The second (in the centre of the figure) shows the loss, by ET cloning, 
of a gene that conveys a phenotypic difference to the host, here a counter 
selectable marker gene (counter-sm). The third shows the loss of a target 
site (RT, shown as triangles on the circular episome) for a site specific 
recombinase (SSR), by ET cloning. In this case, the correct ET cloning 
product deletes one of the target sites required by the SSR to delete a 
selectable marker gene (sm). The failure of the SSR to delete the sm gene 
identifies the correct ET cloning product. 



Figure 3 

A simple example of ET cloning is presented. 

(a) Top panel - PCR products (left lane) synthesized from oligonucleotides 
designed as described in Fig. 1 to amplify by PCR a kanamycin resistance 
gene and to be flanked by homology arms present in the recipient vector, 
were mixed in vitro with the recipient vector (2nd lane) and cotransformed 
into a recET+ E.coli host. The recipient vector carried an ampicillin 
resistance gene. 

(b) Transformation of the sbcA E.coli strain JC9604 with either the PCR 
product alone (0.2 µg) or the vector alone (0.3 µg) did not convey 
resistance to double selection with ampicillin and kanamycin (amp + kan); 
however, cotransformation of both the PCR product and the vector produced 
double resistant colonies. More than 95% of these colonies contained the 
correct ET cloning product where the kanamycin gene had precisely 
integrated into the recipient vector according to the choice of homology 
arms. The two lanes on the right of (a) show Pvu II restriction enzyme 
digestion of the recipient vector before and after ET cloning. 

(c) As for (b), except that six PCR products (0.2 µg each) were 
cotransformed with pSVpaZ11 (0.3 µg each) into JC9604 and plated onto 
Amp + Kan plates or Amp plates. Results are plotted as Amp + Kan-resistant 
colonies, representing recombination products, divided by Amp-resistant 
colonies, representing the plasmid transformation efficiency of the 
competent cell preparation, × 10^6. The PCR products were equivalent to the 
a-b PCR product except that homology arm lengths were varied. Results are 
from five experiments that used the same batches of competent cells and 
DNAs. Error bars represent standard deviation. 

(d) Eight products flanked by 50 bp homology arms were cotransformed with 
pSVpaZ11 into JC9604. All eight PCR products contained the same left 
homology arm and amplified neo gene. The right homology arms were chosen 
from the pSVpaZ11 sequence to be adjacent to (0), or at increasing 
distances (7-3100 bp) from, the left. Results are from four experiments. 
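
The normalization used for panel (c) can be sketched as follows. This is a minimal illustration, assuming a scale factor of 10^6 (exposed as a parameter, since only the ratio itself is fixed by the text); the function names and the colony counts are hypothetical and are not data from the experiments.

```python
from statistics import mean, stdev


def normalized_efficiency(amp_kan_colonies: float, amp_colonies: float,
                          scale: float = 1e6) -> float:
    """Recombinants (Amp + Kan resistant) per transformant (Amp resistant),
    multiplied by `scale` (assumed 10^6)."""
    if amp_colonies <= 0:
        raise ValueError("Amp-resistant colony count must be positive")
    return amp_kan_colonies / amp_colonies * scale


def summarize(experiments, scale: float = 1e6):
    """Return (mean, standard deviation) over repeated experiments.

    `experiments` is a sequence of (amp_kan_colonies, amp_colonies) pairs.
    """
    values = [normalized_efficiency(r, t, scale) for r, t in experiments]
    return mean(values), stdev(values)


# Hypothetical colony counts (Amp + Kan, Amp) for five experiments.
example_counts = [(120, 2.0e6), (95, 1.8e6), (150, 2.4e6),
                  (110, 2.1e6), (130, 2.2e6)]
avg, sd = summarize(example_counts)
```

Dividing by the Amp-resistant count removes batch-to-batch variation in competent cell quality, so that differences between PCR products reflect recombination rather than DNA uptake.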



Figure 4 



ET cloning in an approximately 100 kb P1 vector to exchange the selectable 
marker. 

A P1 clone which uses a kanamycin resistance gene as selectable marker 
and which contains at least 70 kb of the mouse Hoxa gene cluster was 
used. Before ET cloning, this episome conveys kanamycin resistance (top 
panel, upper left) to its host E.coli which are ampicillin sensitive (top panel, 
upper right). A linear DNA fragment designed to replace the kanamycin 
resistance gene with an ampicillin resistance gene was made by PCR as 
outlined in Fig. 1 and transformed into E.coli host cells in which the recipient 
Hoxa/P1 vector was resident. ET cloning resulted in the deletion of the 
kanamycin resistance gene, and restoration of kanamycin sensitivity (top 
panel, lower left) and the acquisition of ampicillin resistance (top panel, 
lower right). Precise DNA recombination was verified by restriction digestion 
and Southern blotting analyses of isolated DNA before and after ET cloning 
(lower panel). 

Figure 5 



ET cloning to remove a counter selectable marker 

A PCR fragment (upper panel, left, third lane) made as outlined in Figs. 1 
and 2 to contain the kanamycin resistance gene was directed by its chosen 
homology arms to delete the counter selectable ccdB gene present in the 
vector, pZero-2.1. The PCR product and the pZero vector were mixed in 
vitro (upper panel, left, 1st lane) before cotransformation into a recE/recT+ 
E.coli host. Transformation of pZero-2.1 alone and plating onto kanamycin 
selection medium resulted in little colony growth (lower panel, left). 
Cotransformation of pZero-2.1 and the PCR product presented ET cloning 
products (lower panel, right) which showed the intended molecular event 
as visualized by Pvu II digestion (upper panel, right). 

Figure 6 

ET cloning mediated by inducible expression of recE and recT from an 
episome. 

RecE/RecT mediate homologous recombination between linear and circular 
DNA molecules. (a) The plasmid pBAD24-recET was transformed into E.coli 
JC5547, and then batches of competent cells were prepared after induction 
of RecE/RecT expression by addition of L-arabinose for the times indicated 
before harvesting. A PCR product, made using oligonucleotides e and f to 
contain the chloramphenicol resistance gene (cm) of pMAK705 and 50 bp 
homology arms chosen to flank the ampicillin resistance gene (bla) of 
pBAD24-recET, was then transformed and recombinants identified on 
chloramphenicol plates. (b) Arabinose was added to cultures of 
pBAD24-recET transformed JC5547 for different times immediately before 
harvesting for competent cell preparation. Total protein expression was 
analyzed by SDS-PAGE and Coomassie blue staining. (c) The number of 
chloramphenicol resistant colonies per µg of PCR product was normalized 
against a control for transformation efficiency, determined by including 
5 pg pZero2.1, conveying kanamycin resistance, in the transformation and 
plating an aliquot onto Kan plates. 

Figure 7A 

The plasmid pBAD24-recET is shown by diagram. The plasmid contains the 
genes recE (in a truncated form) and recT under control of the inducible 
BAD promoter (PBAD). The plasmid further contains an ampicillin resistance 
gene (Ampr) and an araC gene. 

Figure 7B 

The nucleic acid sequence and the protein coding portions of pBAD24-recET 
are depicted. 



Figure 8 

Manipulation of a large E.coli episome by multiple recombination steps. 
(a) Scheme of the recombination reactions. A P1 clone of the mouse Hoxa 
complex, resident in JC9604, was modified by recombination with PCR 
products that contained the neo gene and two Flp recombination targets 
(FRTs). The two PCR products were identical except that one was flanked 
by g and h homology arms (insertion), and the other was flanked by i and 
h homology arms (deletion). In a second step, the neo gene was removed 
by Flp recombination between the FRTs by transient transformation of a Flp 
expression plasmid based on the pSC101 temperature-sensitive origin (ts 
ori). (b) Upper panel: ethidium bromide stained agarose gel showing EcoRI 
digestions of P1 DNA preparations from three independent colonies for each 
step. Middle panel: a Southern blot of the upper panel hybridized with a neo 
gene probe. Lower panel: a Southern blot of the upper panel hybridized with 
a Hoxa3 probe to visualize the site of recombination. Lanes 1, the original 
Hoxa3 P1 clone grown in E.coli strain NS3145. Lanes 2, replacement of the 
Tn903 kanamycin resistance gene resident in the P1 vector with an 
ampicillin resistance gene increased the 8.1 kb band (lanes 1) to 9.0 kb. 
Lanes 3, insertion of the Tn5-neo gene with g-h homology arms upstream 
of Hoxa3 increased the 6.7 kb band (lanes 1, 2) to 9.0 kb. Lanes 4, Flp 
recombinase deleted the g-h neo gene, reducing the 9.0 kb band (lanes 3) 
back to 6.7 kb. Lanes 5, deletion of 6 kb of Hoxa3-4 intergenic DNA by 
replacement with the i-h neo gene decreased the 6.7 kb band (lanes 2) to 
4.5 kb. Lanes 6, Flp recombinase deleted the i-h neo gene, reducing the 4.5 
kb band to 2.3 kb. 

Figure 9 

Manipulation of the E.coli chromosome. (a) Scheme of the recombination 
reactions. The endogenous lacZ gene of JC9604 at 7.8' of the E.coli 
chromosome, shown in expanded form with relevant Ava I sites and 
coordinates, was targeted by a PCR fragment that contained the neo gene 
flanked by homology arms j and k, and loxP sites, as depicted. Integration 
of the neo gene removed most of the lacZ gene including an Ava I site to 
alter the 1443 and 3027 bp bands into a 3277 bp band. In a second step, 
the neo gene was removed by Cre recombination between the loxPs by 
transient transformation of a Cre expression plasmid based on the pSC101 
temperature-sensitive origin (ts ori). Removal of the neo gene by Cre 
recombinase reduces the 3277 bp band to 2111 bp. (b) β-galactosidase 
expression evaluated by streaking colonies on X-Gal plates. The top row of 
three streaks shows β-galactosidase expression in the host JC9604 strain 
(w.t.), the lower three rows (Km) show 24 independent primary colonies, 
20 of which display a loss of β-galactosidase expression indicative of the 
intended recombination event. (c) Southern analysis of E.coli chromosomal 
DNA digested with Ava I using a random primed probe made from the entire 
lacZ coding region; lanes 1, 2, w.t.; lanes 3-6, four independent white 
colonies after integration of the j-k neo gene; lanes 7-10, the same four 
colonies after transient transformation with the Cre expression plasmid. 
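
The band sizes quoted in this legend imply net length changes that can be checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and it relies on the legend's statement that loss of the Ava I site converts the two wild-type bands into a single band.

```python
# Band sizes (bp) as quoted in the legend.
WT_BANDS_BP = (1443, 3027)   # two Ava I fragments in the wild-type locus
INTEGRATED_BAND_BP = 3277    # single fragment after integration of neo
AFTER_CRE_BAND_BP = 2111     # fragment after Cre-mediated removal of neo


def net_change(before_bp: int, after_bp: int) -> int:
    """Net change in fragment length; negative values are net losses."""
    return after_bp - before_bp


# Integration removes an Ava I site, so the two wild-type fragments are
# compared, as a sum, with the single fragment seen afterwards.
integration_change = net_change(sum(WT_BANDS_BP), INTEGRATED_BAND_BP)

# Cre recombination between the loxP sites shortens the fragment further.
cre_change = net_change(INTEGRATED_BAND_BP, AFTER_CRE_BAND_BP)
```

On these figures the locus is 1193 bp shorter after integration (more lacZ sequence is removed than neo sequence is added) and a further 1166 bp shorter after Cre recombination.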



Figure 10 



Two rounds of ET cloning to introduce a point mutation. (a) Scheme of the 
recombination reactions. The lacZ gene of pSVpaX1 was disrupted in 
JC9604lacZ, a strain made by the experiment of Fig. 9 to ablate endogenous 
lacZ expression and remove competitive sequences, by a sacB-neo gene 
cassette, synthesized by PCR from pIB279 and flanked by l and m homology 
arms. The recombinants, termed pSV-sacB-neo, were selected on 
Amp + Kan plates. The lacZ gene of pSV-sacB-neo was then repaired by a 
PCR fragment made from the intact lacZ gene using l* and m* homology 
arms. The m* homology arm included a silent G to G change that created 
a BamHI site. The recombinants, termed pSVpaX1*, were identified by 
counter selection against the sacB gene using 7% sucrose. (b) 
β-galactosidase expression from pSVpaX1 was disrupted in pSV-sacB-neo and 
restored in pSVpaX1*. Expression was analyzed on X-gal plates. Three 
independent colonies of each pSV-sacB-neo and pSVpaX1* are shown. (c) 
Ethidium bromide stained agarose gels of BamHI digested DNA prepared 
from independent colonies taken after counter selection with sucrose. All 
β-galactosidase expressing colonies (blue) contained the introduced BamHI 
restriction site (upper panel). All white colonies displayed large 
rearrangements and no product carried the diagnostic 1.5 kb BamHI 
restriction fragment (lower panel). 

Figure 11 

Transference of ET cloning into a recBC+ host to modify a large episome. 
(a) Scheme of the plasmid, pBAD-ETγ, which carries the mobile ET system, 
and the strategy employed to target the Hoxa P1 episome. pBAD-ETγ is 
based on pBAD24 and includes (i) the truncated recE gene (t-recE) under 
the arabinose-inducible PBAD promoter; (ii) the recT gene under the EM7 
promoter; and (iii) the redγ gene under the Tn5 promoter. It was 
transformed into NS3145, a recA E.coli strain which contained the Hoxa P1 
episome. After arabinose induction, competent cells were prepared and 
transformed with a PCR product carrying the chloramphenicol resistance 
gene (cm) flanked by n and p homology arms. n and p were chosen to 
recombine with a segment of the P1 vector. (b) Southern blots of Pvu II 
digested DNAs hybridized with a probe made from the P1 vector to visualize 
the recombination target site (upper panel) and a probe made from the 
chloramphenicol resistance gene (lower panel). Lane 1, DNA prepared from 
cells harboring the Hoxa P1 episome before ET cloning. Lanes 2-17, DNA 
prepared from 16 independent chloramphenicol resistant colonies. 

Figure 12 

Comparison of ET cloning using the recE/recT genes in pBAD-ETγ with 
redα/redβ genes in pBAD-αβγ. 

The plasmids pBAD-ETγ or pBAD-αβγ, depicted, were transformed into the 
E.coli recA-, recBC+ strain, DK1, and targeted by a chloramphenicol gene 
as described in Fig. 6 to evaluate ET cloning efficiencies. Arabinose 
induction of protein expression was for 1 hour. 

Figure 13A 

The plasmid pBAD-ETγ is shown by diagram. 

Figure 13B 

The nucleic acid sequence and the protein coding portions of pBAD-ETγ are 
depicted. 

Figure 14A 

The plasmid pBAD-αβγ is shown by diagram. This plasmid substantially 
corresponds to the plasmid shown in Fig. 13 except that the recE and recT 
genes are substituted by the redα and redβ genes. 

Figure 14B 

The nucleic acid sequence and the protein coding portions of pBAD-αβγ are 
depicted. 

1. Methods 

1.1 Preparation of linear fragments 

Standard PCR reaction conditions were used to amplify linear DNA 
fragments. The sequences of the primers used are depicted in Table 1. 



Table 1 



The Tn5-neo gene from pJP5603 (Penfold and Pemberton, Gene 118 
(1992), 145-146) was amplified by using oligo pairs a/b and c/d. The 
chloramphenicol (cm) resistance gene from pMAK705 (Hashimoto-Gotoh and 
Sekiguchi, J.Bacteriol. 131 (1977), 405-412) was amplified by using primer 
pairs e/f and n/p. The Tn5-neo gene flanked by FRT or loxP sites was 
amplified from pKaZ or pKaX (http://www.embl-heidelberg.de/ExternalInfo/Stewart) 
using oligo pairs i/h, g/h and j/k. The sacB-neo cassette from 
pIB279 (Blomfield et al., Mol. Microbiol. 5 (1991), 1447-1457) was amplified 
by using oligo pair l/m. The lacZ gene fragment from pSVpaZ11 (Buchholz 
et al., Nucleic Acids Res. 24 (1996), 4256-4262) was amplified using oligo 
pair l*/m*. PCR products were purified using the QIAGEN PCR Purification 
Kit and eluted with H2O, followed by digestion of any residual template 
DNA with Dpn I. After digestion, PCR products were extracted once with 
Phenol:CHCl3, ethanol precipitated and resuspended in H2O at approximately 
0.5 µg/µl. 

1.2 Preparation of competent cells and electroporation 

Saturated overnight cultures were diluted 50 fold into LB medium and grown 
to an OD600 of 0.5, followed by chilling on ice for 15 min. Bacterial cells 
were centrifuged at 7,000 rpm for 10 min at 0°C. The pellet was 
resuspended in ice-cold 10% glycerol and centrifuged again (7,000 rpm, 
-5°C, 10 min). This was repeated twice more and the cell pellet was 
suspended in an equal volume of ice-cold 10% glycerol. Aliquots of 50 µl 
were frozen in liquid nitrogen and stored at -80°C. Cells were thawed on 
ice and 1 µl DNA solution (containing, for co-transformation, 0.3 µg plasmid 
and 0.2 µg PCR products; or, for transformation, 0.2 µg PCR products) was 
added. Electroporation was performed using ice-cold cuvettes and a Bio-Rad 
Gene Pulser set to 25 µFD, 2.3 kV with Pulse Controller set at 200 ohms. 
LB medium (1 ml) was added after electroporation. The cells were incubated 
at 37°C for 1 hour with shaking and then spread on antibiotic plates. 
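
As a rough orientation for the pulse settings given above, the nominal time constant of the discharge follows from the capacitance and the parallel resistance alone (tau = R x C). The sketch below is an approximation under that assumption; the function name is hypothetical, and the time constant actually observed also depends on the resistance of the sample in the cuvette.

```python
def nominal_time_constant_ms(capacitance_uF: float, resistance_ohm: float) -> float:
    """Nominal RC time constant in milliseconds.

    Microfarads multiplied by ohms gives microseconds; dividing by 1000
    converts to milliseconds. Sample resistance is ignored (assumption).
    """
    if capacitance_uF <= 0 or resistance_ohm <= 0:
        raise ValueError("capacitance and resistance must be positive")
    return capacitance_uF * resistance_ohm / 1000.0


# Settings quoted in the protocol: 25 µF capacitor, 200 ohm Pulse Controller.
tau_ms = nominal_time_constant_ms(25, 200)
```

With these settings the nominal value is 5 ms.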

1.3 Induction of RecE and RecT expression 

E.coli JC5547 carrying pBAD24-recET was cultured overnight in LB medium 
plus 0.2% glucose, 100 µg/ml ampicillin. Five parallel LB cultures, one of 
which (0) included 0.2% glucose, were started by a 1/100 inoculation. The 
cultures were incubated at 37°C with shaking for 4 hours and 0.1% 
L-arabinose was added 3, 2, 1 or 1/2 hour before harvesting and processing 
as above. Immediately before harvesting, 100 µl was removed for analysis 
on a 10% SDS-polyacrylamide gel. E.coli NS3145 carrying Hoxa-P1 and 
pBAD-ETγ was induced by 0.1% L-arabinose for 90 min before harvesting. 
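
The time course for JC5547 can be laid out as a schedule. The sketch below assumes one reading of the protocol, namely that all cultures are harvested together after 4 hours of growth, so that each induction period is counted back from the harvest; the function name is hypothetical.

```python
def arabinose_addition_times(culture_h: float, induction_h):
    """Return (induction period, hours after inoculation at which
    L-arabinose is added) for each culture, assuming a common harvest time."""
    schedule = []
    for induction in induction_h:
        if induction < 0 or induction > culture_h:
            raise ValueError("induction period must lie within the culture time")
        schedule.append((induction, culture_h - induction))
    return schedule


# Induction periods from the protocol: 3, 2, 1 and 1/2 hour before harvesting.
schedule = arabinose_addition_times(4.0, [3.0, 2.0, 1.0, 0.5])
```

Under this reading, arabinose is added 1, 2, 3 and 3.5 hours after inoculation, while the glucose-containing culture (0) receives none.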

1.4 Transient transformation of FLP and Cre expression plasmids 

The FLP and Cre expression plasmids, 705-Cre and 705-FLP (Buchholz et 
al., Nucleic Acids Res. 24 (1996), 3118-3119), based on the pSC101 
temperature sensitive origin, were transformed into rubidium chloride 
competent bacterial cells. Cells were spread on 25 µg/ml chloramphenicol 
plates, and grown for 2 days at 30°C, whereupon colonies were picked, 
replated on L-agar plates without any antibiotics and incubated at 40°C 
overnight. Single colonies were analyzed on various antibiotic plates and all 
showed the expected loss of chloramphenicol and kanamycin resistance. 

1.5 Sucrose counter selection of sacB expression 

The E.coli JC9604lacZ strain, generated as described in Fig. 9, was 
cotransformed with a sacB-neo PCR fragment and pSVpaX1 (Buchholz et 
al., Nucleic Acids Res. 24 (1996), 4256-4262). After selection on 100 µg/ml 
ampicillin, 50 µg/ml kanamycin plates, pSVpaX-sacB-neo plasmids were 
isolated and cotransformed into fresh JC9604lacZ cells with a PCR 
fragment amplified from pSVpaX1 using primers l*/m*. Oligo m* carried a 
silent point mutation which generated a BamHI site. Cells were plated on 
7% sucrose, 100 µg/ml ampicillin, 40 µg/ml X-gal plates and incubated at 
28°C for 2 days. The blue and white colonies grown on sucrose plates 
were counted and further checked by restriction analysis. 

1.6 Other methods 



DNA preparation and Southern analysis were performed according to 
standard procedures. Hybridization probes were generated by random 
priming of fragments isolated from the Tn5 neo gene (PvuII), Hoxa3 gene 
(both HindIII fragments), lacZ genes (EcoRI and BamHI fragments from 
pSVpaX1), cm gene (BstBI fragments from pMAK705) and P1 vector 
fragments (2.2 kb EcoRI fragments from P1 vector). 

2. Results 

2.1 Identification of recombination events in E.coli 

To identify a flexible homologous recombination reaction in E.coli, an assay 
based on recombination between linear and circular DNAs was designed 
(Fig. 1, Fig. 3). Linear DNA carrying the Tn5 kanamycin resistance gene (neo) 
was made by PCR (Fig. 3a). Initially, the oligonucleotides used for PCR 
amplification of neo were 60mers consisting of 42 nucleotides at their 5' 
ends identical to chosen regions in the plasmid and, at the 3' ends, 18 
nucleotides to serve as PCR primers. Linear and circular DNAs were mixed 
in equimolar proportions and co-transformed into a variety of E.coli hosts. 
Homologous recombination was only detected in sbcA E.coli hosts. More 
than 95% of double ampicillin/kanamycin resistant colonies (Fig. 3b) 
contained the expected homologously recombined plasmid as determined 
by restriction digestion and sequencing. Only a low background of 
kanamycin resistance, due to genomic integration of the neo gene, was 
apparent (not shown). 

The linear plus circular recombination reaction was characterized in two 
ways. The relationship between homology arm length and recombination 
efficiency was simple, with longer arms recombining more efficiently 
(Fig. 3c). Efficiency increased within the range tested, up to 60 bp. The 
effect of distance between the two chosen homology sites in the recipient 
plasmid was examined (Fig. 3d). A set of eight PCR fragments was 
generated by use of a constant left homology arm with differing right 
homology arms. The right homology arms were chosen from the plasmid 
sequence to be 0 - 3100 bp from the left. Correct products were readily 
obtained from all, with less than 4 fold difference between them, although 
the insertional product (0) was least efficient. Correct products also 
depended on the presence of both homology arms, since PCR fragments 
containing only one arm failed to work. 
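
The primer layout described in this section (a homology arm at the 5' end followed by a priming region at the 3' end) can be sketched in Python. This is a minimal illustration, not the procedure used to design the oligonucleotides of Table 1: the function names, the coordinate convention and the random sequences are hypothetical. The defaults of 42 and 18 nucleotides follow the 60mers described above, and `arm_len` can be varied to mimic the arm-length series of Fig. 3c.

```python
import random
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_et_primers(target: str, cassette: str, left_end: int,
                      right_start: int, arm_len: int = 42,
                      prime_len: int = 18) -> Tuple[str, str]:
    """Return (forward, reverse) primers, both written 5'->3'.

    left_end    -- index in `target` just after the left homology arm
    right_start -- index in `target` at which the right homology arm begins
    Each primer carries its homology arm at the 5' end and the region that
    primes on `cassette` at the 3' end.
    """
    if left_end < arm_len or right_start + arm_len > len(target):
        raise ValueError("homology arm runs off the end of the target")
    if right_start < left_end:
        raise ValueError("right arm must not begin before the left arm ends")
    if len(cassette) < prime_len:
        raise ValueError("cassette is shorter than the priming region")
    left_arm = target[left_end - arm_len:left_end]
    right_arm = target[right_start:right_start + arm_len]
    forward = left_arm + cassette[:prime_len]
    reverse = reverse_complement(cassette[-prime_len:] + right_arm)
    return forward, reverse


# Hypothetical random sequences standing in for a plasmid and a marker gene.
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(300))
cassette = "".join(rng.choice("ACGT") for _ in range(200))
forward, reverse = design_et_primers(target, cassette, left_end=100, right_start=150)
```

Setting `right_start` equal to `left_end` corresponds to the insertional case (0), while a larger gap corresponds to replacement of the intervening sequence.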

2.2 Involvement of RecE and RecT 

The relationship between host genotype and this homologous recombination 
reaction was more systematically examined using a panel of E.coli strains 
deficient in various recombination components (Table 2). 

Table 2 

Only the two sbcA strains, JC8679 and JC9604 presented the intended 
recombination products and RecA was not required. In sbcA strains, 
expression of RecE and RecT is activated. Dependence on recE can be 
inferred from comparison of JC8679 with JC8691. Notably no 
recombination products were observed in JC9387 suggesting that the 
sbcBC background is not capable of supporting homologous recombination 
based on 50 nucleotide homology arms. 

To demonstrate that RecE and RecT are involved, part of the recET operon 
was cloned into an inducible expression vector to create pBAD24-recET 
(Fig. 6a). The recE gene was truncated at its N-terminal end, as the first 588 
amino acids of RecE are dispensable. The recBC strain, JC5547, was transformed 
with pBAD24-recET and a time course of RecE/RecT induction performed 
by adding arabinose to the culture media at various times before harvesting 
for competent cells. The batches of harvested competent cells were 
evaluated for protein expression by gel electrophoresis (Fig. 6b) and for 
recombination between a linear DNA fragment and the endogenous 
pBAD24-recET plasmid (Fig. 6c). Without induction of RecE/RecT, no 
recombinant products were found, whereas recombination increased in 
approximate concordance with increased RecE/RecT expression. This 
experiment also shows that co-transformation of linear and circular DNAs 
is not essential and the circular recipient can be endogenous in the host. 
From the results shown in Figs. 3, 6 and Table 2, we conclude that RecE 
and RecT mediate a very useful homologous recombination reaction in 
recBC E.coli at workable frequencies. Since RecE and RecT are involved, we 
refer to this way of recombining linear and circular DNA fragments as "ET 
cloning". 

2.3 Application of ET cloning to large target DNAs 

To show that large DNA episomes could be manipulated in E.coli, a > 76 
kb P1 clone that contains at least 59 kb of the intact mouse Hoxa complex 
(confirmed by DNA sequencing and Southern blotting) was transferred to 
an E.coli strain having an sbcA background (JC9604) and subjected to two 
rounds of ET cloning. In the first round, the Tn903 kanamycin resistance 
gene resident in the P1 vector was replaced by an ampicillin resistance gene 
(Fig. 4). In the second round, the interval between the Hoxa3 and a4 genes 
was targeted either by inserting the neo gene between two base pairs 
upstream of the Hoxa3 proximal promoter, or by deleting 6203 bp between 
the Hoxa3 and a4 genes (Fig. 8a). Both insertional and deletional ET cloning 
products were readily obtained (Fig. 8b, lanes 2, 3 and 5) showing that the 
two rounds of ET cloning took place in this large E.coli episome with 
precision and no apparent unintended recombination. 

The general applicability of ET cloning was further examined by targeting 
a gene in the E.coli chromosome (Fig. 9a). The β-galactosidase (lacZ) gene 
of JC9604 was chosen so that the ratio between correct and incorrect 
recombinants could be determined by evaluating β-galactosidase 
expression. Standard conditions (0.2 µg PCR fragment; 50 µl competent 
cells) produced 24 primary colonies, 20 of which were correct as 
determined by β-galactosidase expression (Fig. 9b) and DNA analysis 
(Fig. 9c, lanes 3-6). 

2.4 Secondary recombination reactions to remove operational sequences 

The products of ET cloning as described above are limited by the necessary 
inclusion of selectable marker genes. Two different ways to use a further 
recombination step to remove this limitation were developed. In the first 
way, site specific recombination mediated by either Flp or Cre recombinase 
was employed. In the experiments of Figs. 8 and 9, either Flp recombination 
target sites (FRTs) or Cre recombination target sites (loxPs) were included 
to flank the neo gene in the linear substrates. Recombination between the 
FRTs or loxPs was accomplished by Flp or Cre, respectively, expressed from 
plasmids with the pSC101 temperature sensitive replication origin 
(Hashimoto-Gotoh and Sekiguchi, J.Bacteriol. 131 (1977), 405-412) to 
permit simple elimination of these plasmids after site specific recombination 
by temperature shift. The precisely recombined Hoxa P1 vector was 
recovered after both ET and Flp recombination with no other recombination 
products apparent (Fig. 8, lanes 4 and 6). Similarly, Cre recombinase 
precisely recombined the targeted lacZ allele (Fig. 9, lanes 7-10). Thus site 
specific recombination can be readily coupled with ET cloning to remove 
operational sequences and leave a 34 bp site specific recombination target 
site at the point of DNA manipulation. 

In the second way to remove the selectable marker gene, two rounds of ET 
cloning, combining positive and counter selection steps, were used to leave 
the DNA product free of any operational sequences (Fig. 10a). 

Additionally this experiment was designed to evaluate, by a functional test 
based on β-galactosidase activity, whether ET cloning promoted small 
mutations such as frame shift or point mutations within the region being 
manipulated. In the first round, the lacZ gene of pSVpaX1 was disrupted 
with a 3.3 kb PCR fragment carrying the neo and B.subtilis sacB (Blomfield 
et al., Mol. Microbiol. 5 (1991), 1447-1457) genes, by selection for 
kanamycin resistance (Fig. 10a). As shown above for other positively 
selected recombination products, virtually all selected colonies were white 
(Fig. 10b), indicative of successful lacZ disruption, and 17 of 17 were 
confirmed as correct recombinants by DNA analysis. In the second round, 
a 1.5 kb PCR fragment designed to repair lacZ was introduced by counter 
selection against the sacB gene. Repair of lacZ included a silent point 
mutation to create a BamHI restriction site. Approximately one quarter of 
sucrose resistant colonies expressed β-galactosidase, and all analyzed (17 
of 17; Fig. 10c) carried the repaired lacZ gene with the BamHI point 
mutation. The remaining three quarters of sucrose resistant colonies did not 
express β-galactosidase, and all analyzed (17 of 17; Fig. 10c) had 
undergone a variety of large mutational events, none of which resembled 
the ET cloning product. Thus, in two rounds of ET cloning directed at the 
lacZ gene, no disturbances of β-galactosidase activity by small mutations 
were observed, indicating that RecE/RecT recombination works with high 
fidelity. The significant presence of incorrect products observed in the 
counter selection step is an inherent limitation of the use of counter 
selection, since any mutation that ablates expression of the counter 
selection gene will be selected. Notably, all incorrect products were large 
mutations and therefore easily distinguished from the correct ET product by 
DNA analysis. In a different experiment (Fig. 5), we observed that ET cloning 
into pZero2.1 (InVitroGen) by counter selection against the ccdB gene gave 
a lower background of incorrect products (8%), indicating that the counter 
selection background is variable according to parameters that differ from 
those that influence ET cloning efficiencies. 

2.5 Transference of ET cloning between E.coli hosts 

The experiments shown above were performed in recBC- E.coli hosts since 
the sbcA mutation had been identified as a suppressor of recBC (Barbour 
et al., Proc.Natl.Acad.Sci. USA 67 (1970), 128-135; Clark, Genetics 78 
(1974), 259-271). However, many useful E.coli strains are recBC+, 
including strains commonly used for propagation of P1, BAC or PAC 
episomes. To transfer ET cloning into recBC+ strains, we developed 
pBAD-ETγ and pBAD-αβγ (Figs. 13 and 14). These plasmids incorporate three 
features important to the mobility of ET cloning. First, RecBC is the major 
E.coli exonuclease and degrades introduced linear fragments. Therefore the 
RecBC inhibitor, Redγ (Murphy, J.Bacteriol. 173 (1991), 5808-5821), was 
included. Second, the recombinogenic potential of RecE/RecT, or 
Redα/Redβ, was regulated by placing recE or redα under an inducible 
promoter. Consequently ET cloning can be induced when required and 
undesired recombination events are restricted at other times. Third, 
we observed that ET cloning efficiencies are enhanced when RecT, or Redβ, 
but not RecE, or Redα, is overexpressed. Therefore we placed recT, or redβ, 
under the strong, constitutive, EM7 promoter. 

pBAD-ETγ was transformed into NS3145 E.coli harboring the original Hoxa
P1 episome (Fig. 11a). A region in the P1 vector backbone was targeted with
a PCR-amplified chloramphenicol resistance gene (cm) flanked by
n and p homology arms. As described above for positively selected ET
cloning reactions, most (> 90%) chloramphenicol resistant colonies were
correct. Notably, the overall efficiency of ET cloning, in terms of linear DNA
transformed, was nearly three times better using pBAD-ETγ than with
similar experiments based on targeting the same episome in the sbcA host,
JC9604. This is consistent with our observation that overexpression of
RecT improves ET cloning efficiencies.
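
The linear targeting fragment used above, a selectable gene flanked by two homology arms, can be pictured as a PCR product whose primers carry the arms as 5' extensions. The following is a minimal sketch of that primer layout and not the procedure used in these experiments: the function names, the placeholder sequences and the top-strand orientation convention are assumptions made for illustration, and the 15-nucleotide minimum arm length is borrowed from the claims.

```python
# Illustrative sketch only: all names and sequences here are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_targeting_primers(left_arm, right_arm, cassette_start, cassette_end,
                             min_arm=15):
    """Build two PCR primers whose 5' ends carry the homology arms.

    All inputs are written 5'->3' on the top strand:
      left_arm, right_arm          target sequences flanking the insertion site
      cassette_start, cassette_end first and last bases of the marker cassette
    """
    for arm in (left_arm, right_arm):
        if len(arm) < min_arm:
            raise ValueError(f"homology arm shorter than {min_arm} nt")
    forward = left_arm + cassette_start
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of (end of cassette + right arm).
    reverse = reverse_complement(cassette_end + right_arm)
    return forward, reverse


def expected_product(left_arm, cassette, right_arm):
    """Top strand of the linear fragment: arm + cassette + arm."""
    return left_arm + cassette + right_arm


if __name__ == "__main__":
    left = "ATGACCATGATTACGGATTCACTGG"    # placeholder 25-nt arm
    right = "CCGTCGTTTTACAACGTCGTGACTG"   # placeholder 25-nt arm
    cassette = "GGAGTTCAGCTT" + "ACGT" * 5 + "TTACGCCCCGCC"  # placeholder
    fwd, rev = design_targeting_primers(left, right, cassette[:12], cassette[-12:])
    print("forward primer:", fwd)
    print("reverse primer:", rev)
    print("product length:", len(expected_product(left, cassette, right)))
```

Keeping both arms on the top strand and reverse-complementing only once, inside the function, is a design choice that makes the orientation of the reverse primer harder to get wrong.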

A comparison between ET cloning efficiencies mediated by RecE/RecT,
expressed from pBAD-ETγ, and Redα/Redβ, expressed from pBAD-αβγ, was
made in the recA-, recBC+ E.coli strain, DK1 (Fig. 12). After transformation
of E.coli DK1 with either pBAD-ETγ or pBAD-αβγ, the same experiment as
described in Figure 6a, c, to replace the bla gene of the pBAD vector with
a chloramphenicol resistance gene, was performed. Both pBAD-ETγ and
pBAD-αβγ presented similar ET cloning efficiencies in terms of
responsiveness to arabinose induction of RecE and Redα, and number of
targeted events.
E.coli Strains   Genotypes            Amp+Kan   Amp (X 10 V)

JC8679           recBC sbcA           318       2.30
JC9604           recA recBC sbcA      114       0.30
JC8691           recBC sbcA recE      0         0.37
JC5547           recA recBC           0         0.37
JC5519           recBC                0         1.80
JC15329          recA recBC sbcBC     0         0.03
JC9387           recBC sbcBC          0         2.20
JC8111           recBC sbcBC recF     0         2.40
JC9366           recA                 0         0.37
JC13031          recJ                 0         0.45
Claims

1. A method for cloning DNA molecules in cells comprising the steps of:

a) providing a host cell capable of performing homologous
recombination,

b) contacting in said host cell a first DNA molecule which is
capable of being replicated in said host cell with a second
DNA molecule comprising at least two regions of sequence
homology to regions on the first DNA molecule, under
conditions which favour homologous recombination between
said first and second DNA molecules and

c) selecting a host cell in which homologous recombination
between said first and second DNA molecules has occurred.

2. The method according to claim 1 wherein the homologous
recombination occurs via the recET cloning mechanism.

3. The method according to claim 2 wherein the host cell is capable of
expressing recE and recT genes.

4. The method according to claim 3 wherein the recE and recT genes
are selected from E.coli recE and recT genes or from λ redα and redβ
genes.

5. The method according to claim 3 or 4 wherein the host cell is
transformed with at least one vector capable of expressing recE
and/or recT genes.

6. The method of claim 3, 4 or 5 wherein the expression of the recE
and/or recT genes is under control of a regulatable promoter.



7. The method of claim 5 or 6 wherein the recT gene is overexpressed
versus the recE gene.

8. The method according to any one of claims 3 to 7 wherein the recE
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 1320 (ATG) to 2159
(GAC) as depicted in Fig. 7B,

(b) the nucleic acid sequence from position 1320 (ATG) to 1998
(CGA) as depicted in Fig. 13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequence from (a), (b) and/or (c).

9. The method according to any one of claims 3 to 8 wherein the recT
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 2155 (ATG) to 2961
(GAA) as depicted in Fig. 7B,

(b) the nucleic acid sequence from position 2086 (ATG) to 2868
(GCA) as depicted in Fig. 13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequences from (a), (b) and/or (c).

10. The method according to any one of the previous claims wherein the
host cell is a gram-negative bacterial cell.

11. The method according to claim 10 wherein the host cell is an
Escherichia coli cell.
12. The method according to claim 11 wherein the host cell is an
Escherichia coli K12 strain.

13. The method according to claim 12 wherein the E.coli strain is
selected from JC 8679 and JC 9604.

14. The method according to any one of the previous claims wherein the
host cell further is capable of expressing a recBC inhibitor gene.

15. The method according to claim 14 wherein the host cell is
transformed with a vector expressing the recBC inhibitor gene.

16. The method according to claim 14 or 15 wherein the recBC inhibitor
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 3588 (ATG) to 4002
(GTA) as depicted in Fig. 13B,

(b) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(c) a nucleic acid sequence which hybridizes under stringent
conditions (as defined above) with the nucleic acid sequence from (a)
and/or (b).

17. The method according to any one of claims 13 to 16 wherein the
host cell is a prokaryotic recBC+ cell.

18. The method according to any one of the previous claims wherein the
first DNA molecule is circular.

19. The method according to any one of the previous claims wherein the
first DNA molecule is an extrachromosomal DNA molecule containing
an origin of replication which is operative in the host cell.
20. The method according to claim 18 or 19 wherein the first DNA
molecule is selected from plasmids, cosmids, P1 vectors, BAC
vectors and PAC vectors.

21. The method according to any one of claims 1-18 wherein the first
DNA molecule is a host cell chromosome.

22. The method according to any one of the previous claims wherein the
second DNA molecule is linear.

23. The method according to any one of the previous claims wherein the
regions of sequence homology are at least 15 nucleotides each.

24. The method according to one of claims 1 to 16 wherein the second
DNA molecule is obtained by an amplification reaction.

25. The method according to one of the previous claims wherein the first
and/or second DNA molecules are introduced into the host cells by
transformation.

26. The method according to claim 25 wherein the transformation
method is electroporation.

27. The method according to one of claims 1 to 26 wherein the first and
second DNA molecules are introduced into the host cell
simultaneously by co-transformation.

28. The method according to one of claims 1 to 26 wherein the second
DNA molecule is introduced into a host cell in which the first DNA
molecule is already present.
29. The method according to one of the previous claims wherein the
second DNA molecule contains at least one marker gene placed
between the two regions of sequence homology and wherein
homologous recombination is detected by expression of said marker
gene.

30. The method according to claim 29 wherein said marker gene is selected
from antibiotic resistance genes, deficiency complementation genes
and reporter genes.

31. The method of any one of claims 1 to 30 wherein the first DNA
molecule contains at least one marker gene between the two regions
of sequence homology and wherein homologous recombination is
detected by lack of expression of said marker gene.

32. The method of any one of claims 1 to 31 wherein said marker gene
is selected from genes which, under selected conditions, convey a
toxic or bacteriostatic effect on the cell, and reporter genes.

33. A method according to any one of the previous claims wherein the
first DNA molecule contains at least one target site for a site specific
recombinase between the two regions of sequence homology and
wherein homologous recombination is detected by removal of said
target site.

34. A method for cloning DNA molecules comprising the steps of:
(a) providing a source of RecE and RecT proteins,
(b) contacting a first DNA molecule which is capable of being
replicated in a suitable host cell with a second DNA molecule
comprising at least two regions of sequence homology to regions on
the first DNA molecule, under conditions which favour homologous
recombination between said first and second DNA molecules and
(c) selecting DNA molecules in which homologous recombination
between said first and second DNA molecules has occurred.

35. The method of claim 34 wherein said RecE and RecT proteins are
selected from E.coli RecE and RecT proteins or from phage λ Redα
and Redβ proteins.

36. The method of claim 34 or 35 wherein the recombination occurs in
vitro.

37. The method of claim 34 or 35 wherein the recombination occurs in
vivo.

38. Use of cells capable of expressing the recE and recT genes as a host
cell for a cloning method involving homologous recombination.

39. Use of a vector system capable of expressing recE and recT genes
in a host cell for a cloning method involving homologous
recombination.

40. Use of claims 38 or 39 wherein the recE and recT genes are selected
from E.coli recE and recT genes or from λ redα and redβ genes.

41. Use of a source of RecE and RecT proteins for a cloning method
involving homologous recombination.

42. Use of claim 41 wherein said RecE and RecT proteins are selected
from E.coli RecE and RecT proteins or from phage λ Redα and Redβ
proteins.
43. A reagent kit for cloning comprising

(a) a host cell

(b) means of expressing recE and recT genes in said host cell and

(c) a recipient cloning vehicle capable of being replicated in said cell.

44. The reagent kit according to claim 43 wherein the means (b)
comprise a vector system capable of expressing the recE and recT
genes in the host cell.

45. The reagent kit according to claim 43 or 44 wherein the recE and
recT genes are selected from E.coli recE and recT genes or from λ
redα and redβ genes.

46. A reagent kit for cloning comprising

(a) a source for RecE and RecT proteins and

(b) a recipient cloning vehicle capable of being propagated in a host
cell.

47. The reagent kit according to claim 46 further comprising a host cell
suitable for propagating said recipient cloning vehicle.

48. The reagent kit according to claim 46 or 47 wherein said RecE and
RecT proteins are selected from E.coli RecE and RecT proteins or
from phage λ Redα and Redβ proteins.

49. The reagent kit according to any one of claims 43-48 further
comprising means for expressing a site specific recombinase in said
host cell.

50. The reagent kit according to any one of claims 43-49 further
comprising nucleic acid amplification primers comprising a region of
homology to said recipient cloning vehicle.
Figure 7b

252< Asn Arg Gly Val Thr Ala Ile Pro Met Arg Thr

252 GGT GCT CAA AAG CAG CTT CGC CTG GCT GAT ACG
241< Thr Ser Leu Leu Leu Lys Ala Gln Ser Ile Arg

285 TTG GTC CTC GCG CCA GCT TAA GAC GCT AAT CCC
230< Gln Asp Glu Arg Trp Ser Leu Val Ser Ile Gly

318 TAA CTG CTG GCG GAA AAG ATG TGA CAG ACG CGA
219< Leu Gln Gln Arg Phe Leu His Ser Leu Arg Ser

351 CGG CGA CAA GCA AAC ATG CTG TGC GAC GCT GGC
208< Pro Ser Leu Cys Val His Gln Ala Val Ser Ala

EcoRV
384 GAT ATC AAA ATT GCT GTC TGC CAG GTG ATC GCT
197< Ile Asp Phe Asn Ser Asp Ala Leu His Asp Ser

417 GAT GTA CTG ACA AGC CTC GCG TAC CCG ATT ATC
186< Ile Tyr Gln Cys Ala Glu Arg Val Arg Asn Asp
Figure 7b (cont'd)

450 CAT CGG TGG ATG GAG CGA CTC GTT AAT CGC TTC
175< Met Pro Pro His Leu Ser Glu Asn Ile Ala Glu

483 CAT GCG CCG CAG TAA CAA TTG CTC AAG CAG ATT
164< Met Arg Arg Leu Leu Leu Gln Glu Leu Leu Asn

516 TAT CGC CAG CAG CTC CGA ATA GCG CCC TTC CCC
153< Ile Ala Leu Leu Glu Ser Tyr Arg Gly Glu Gly

549 TTG CCC GGC GTT AAT GAT TTG CCC AAA CAG GTC
142< Gln Gly Ala Asn Ile Ile Gln Gly Phe Leu Asp

582 GCT GAA ATG CGG CTG GTG CGC TTC ATC CGG GCG
131< Ser Phe His Pro Gln His Ala Glu Asp Pro Arg

615 AAA GAA CCC CGT ATT GGC AAA TAT TGA CGG CCA
120< Phe Phe Gly Thr Asn Ala Phe Ile Ser Pro Trp

648 GTT AAG CCA TTC ATG CCA GTA GGC GCG CGG ACG
109< Asn Leu Trp Glu His Trp Tyr Ala Arg Pro Arg

681 AAA GTA AAC CCA CTG GTG ATA CCA TTC GCG AGC
98< Phe Tyr Val Trp Gln His Tyr Trp Glu Arg Ala

714 CTC CGG ATG ACG ACC GTA GTG ATG AAT CTC TCC
87< Glu Pro His Arg Gly Tyr His His Ile Glu Gly

747 TGG CGG GAA CAG CAA AAT ATC ACC CGG TCG GCA
76< Pro Pro Phe Leu Leu Ile Asp Gly Pro Arg Cys

780 AAC AAA TTC TCG TCC CTG ATT TTT CAC CAC CCC
65< Val Phe Glu Arg Gly Gln Asn Lys Val Val Gly

813 CTG ACC GCG AAT GGT GAG ATT GAG AAT ATA ACC
54< Gln Gly Arg Ile Thr Leu Asn Leu Ile Tyr Gly

846 TTT CAT TCC CAG CGG TCG GTC GAT AAA AAA ATC
43< Lys Met Gly Leu Pro Arg Asp Ile Phe Phe Asp
Figure 7b (cont'd)

879 GAG ATA ACC GTT GGC CTC AAT CGG CGT TAA ACC
32< Leu Tyr Gly Asn Ala Glu Ile Pro Thr Leu Gly

912 CGC CAC GAG ATG GGC ATT AAA CGA GTA TCC CGG
21< Ala Val Leu His Ala Asn Phe Ser Tyr Gly Pro

945 GAG GAG GGG ATC ATT TTG CGC TTC AGC CAT
10< Leu Leu Pro Asp Asn Gln Ala Glu Ala Met

975 ACTTTTCATA CTCCCGCCAT TCAGAGAAGA AACCAATTGT

1015 CCATATTGCA TCAGACATTG CCGTCACTGC GTCTTTTACT

1055 GGCTCTTCTC GCTAACCAAA CCGGTAACCC CGCTTATTAA

1095 AAGCATTCTG TAACAAAGCG GGACCAAAGC CATGACAAAA

1135 ACGCGTAACA AAAGTGTCTA TAATCACGGC AGAAAAGTCC

1175 ACATTGATTA TTTGCACGGC GTCACACTTT GCTATGCCAT

BamHI
1215 AGCATTTTTA TCCATAAGAT TAGCGGATCC TACCTGACGC

1255 TTTTTATCGC AACTCTCTAC TGTTTCTCCA TACCCGTTTT

NheI EcoRI NcoI
1295 TTTGGGCTAG CAGGAGGAAT TCACC ATG GAT CCC GTA
1> Met Asp Pro Val

1332 ATC GTA GAA GAC ATA GAG CCA GGT ATT TAT TAC
5> Ile Val Glu Asp Ile Glu Pro Gly Ile Tyr Tyr

1365 GGA ATT TCG AAT GAG AAT TAC CAC GCG GGT CCC
16> Gly Ile Ser Asn Glu Asn Tyr His Ala Gly Pro

1398 GGT ATC AGT AAG TCT CAG CTC GAT GAC ATT GCT
Figure 7b (cont'd)

27> Gly Ile Ser Lys Ser Gln Leu Asp Asp Ile Ala

1431 GAT ACT CCG GCA CTA TAT TTG TGG CGT AAA AAT
38> Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn

1464 GCC CCC GTG GAC ACC ACA AAG ACA AAA ACG CTC
49> Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu

1497 GAT TTA GGA ACT GCT TTC CAC TGC CGG GTA CTT
60> Asp Leu Gly Thr Ala Phe His Cys Arg Val Leu

EcoRI
1530 GAA CCG GAA GAA TTC AGT AAC CGC TTT ATC GTA
71> Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile Val

1563 GCA CCT GAA TTT AAC CGC CGT ACA AAC GCC GGA
82> Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly

1596 AAA GAA GAA GAG AAA GCG TTT CTG ATG GAA TGC
93> Lys Glu Glu Glu Lys Ala Phe Leu Met Glu Cys

1629 GCA AGC ACA GGA AAA ACG GTT ATC ACT GCG GAA
104> Ala Ser Thr Gly Lys Thr Val Ile Thr Ala Glu

1662 GAA GGC CGG AAA ATT GAA CTC ATG TAT CAA AGC
115> Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser

Figure 7b (cont'd)

1695 GTT ATG GCT TTG CCG CTG GGG CAA TGG CTT GTT
126> Val Met Ala Leu Pro Leu Gly Gln Trp Leu Val

1728 GAA AGC GCC GGA CAC GCT GAA TCA TCA ATT TAC
137> Glu Ser Ala Gly His Ala Glu Ser Ser Ile Tyr

1761 TGG GAA GAT CCT GAA ACA GGA ATT TTG TGT CGG
148> Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg

1794 TGC CGT CCG GAC AAA ATT ATC CCT GAA TTT CAC
159> Cys Arg Pro Asp Lys Ile Ile Pro Glu Phe His

1827 TGG ATC ATG GAC GTG AAA ACT ACG GCG GAT ATT
170> Trp Ile Met Asp Val Lys Thr Thr Ala Asp Ile

1860 CAA CGA TTC AAA ACC GCT TAT TAC GAC TAC CGC
181> Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg

1893 TAT CAC GTT CAG GAT GCA TTC TAC AGT GAC GGT
192> Tyr His Val Gln Asp Ala Phe Tyr Ser Asp Gly

1926 TAT GAA GCA CAG TTT GGA GTG CAG CCA ACT TTC
203> Tyr Glu Ala Gln Phe Gly Val Gln Pro Thr Phe

1959 GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA TGC
214> Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys

1992 GGA CGT TAT CCG GTT GAA ATT TTC ATG ATG GGC
Figure 7b (cont'd)

225> Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly

2025 GAA GAA GCA AAA CTG GCA GGT CAA CAG GAA TAT
236> Glu Glu Ala Lys Leu Ala Gly Gln Gln Glu Tyr

2058 CAC CGC AAT CTG CGA ACC CTG TCT GAC TGC CTG
247> His Arg Asn Leu Arg Thr Leu Ser Asp Cys Leu

BalI
2091 AAT ACC GAT GAA TGG CCA GCT ATT AAG ACA TTA
258> Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu

2124 TCA CTG CCC CGC TGG GCT AAG GAA TAT GCA A
269> Ser Leu Pro Arg Trp Ala Lys Glu Tyr Ala A

2155 ATG ACT AAG CAA CCA CCA ATC GCA AAA GCC GAT
1> Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp
279> sn Asp ***

2188 CTG CAA AAA ACT CAG GGA AAC CGT GCA CCA GCA
12> Leu Gln Lys Thr Gln Gly Asn Arg Ala Pro Ala

2221 GCA GTT AAA AAT AGC GAC GTG ATT AGT TTT ATT
23> Ala Val Lys Asn Ser Asp Val Ile Ser Phe Ile

2254 AAC CAG CCA TCA ATG AAA GAG CAA CTG GCA GCA
34> Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala

NdeI
2287 GCT CTT CCA CGC CAT ATG ACG GCT GAA CGT ATG
45> Ala Leu Pro Arg His Met Thr Ala Glu Arg Met
Figure 7b (cont'd)

2320 ATC CGT ATC GCC ACC ACA GAA ATT CGT AAA GTT
56> Ile Arg Ile Ala Thr Thr Glu Ile Arg Lys Val

2353 CCG GCG TTA GGA AAC TGT GAC ACT ATG AGT TTT
67> Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe

2386 GTC AGT GCG ATC GTA CAG TGT TCA CAG CTC GGA
78> Val Ser Ala Ile Val Gln Cys Ser Gln Leu Gly

2419 CTT GAG CCA GGT AGC GCC CTC GGT CAT GCA TAT
89> Leu Glu Pro Gly Ser Ala Leu Gly His Ala Tyr

2452 TTA CTG CCT TTT GGT AAT AAA AAC GAA AAG AGC
100> Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser

2485 GGT AAA AAG AAC GTT CAG CTA ATC ATT GGC TAT
111> Gly Lys Lys Asn Val Gln Leu Ile Ile Gly Tyr

2518 CGC GGC ATG ATT GAT CTG GCT CGC CGT TCT GGT
122> Arg Gly Met Ile Asp Leu Ala Arg Arg Ser Gly

2551 CAA ATC GCC AGC CTG TCA GCC CGT GTT GTC CGT
133> Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg

2584 GAA GGT GAC GAG TTT AGC TTC GAA TTT GGC CTT
144> Glu Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu

2617 GAT GAA AAG TTA ATA CAC CGC CCG GGA GAA AAC
155> Asp Glu Lys Leu Ile His Arg Pro Gly Glu Asn

2650 GAA GAT GCC CCG GTT ACC CAC GTC TAT GCT GTC
166> Glu Asp Ala Pro Val Thr His Val Tyr Ala Val

2683 GCA AGA CTG AAA GAC GGA GGT ACT CAG TTT GAA
177> Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu

2716 GTT ATG ACG CGC AAA CAG ATT GAG CTG GTG CGC
188> Val Met Thr Arg Lys Gln Ile Glu Leu Val Arg
Figure 7b (cont'd)

2749 AGC CTG AGT AAA GCT GGT AAT AAC GGG CCG TGG
199> Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro Trp

2782 GTA ACT CAC TGG GAA GAA ATG GCA AAG AAA ACG
210> Val Thr His Trp Glu Glu Met Ala Lys Lys Thr

2815 GCT ATT CGT CGC CTG TTC AAA TAT TTG CCC GTA
221> Ala Ile Arg Arg Leu Phe Lys Tyr Leu Pro Val

2848 TCA ATT GAG ATC CAG CGT GCA GTA TCA ATG GAT
232> Ser Ile Glu Ile Gln Arg Ala Val Ser Met Asp

PstI
2881 GAA AAG GAA CCA CTG ACA ATC GAT CCT GCA GAT
243> Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp

2914 TCC TCT GTA TTA ACC GGG GAA TAC AGT GTA ATC
254> Ser Ser Val Leu Thr Gly Glu Tyr Ser Val Ile

BglII HindIII
2947 GAT AAT TCA GAG GAA TAG ATCTAAGCTT
265> Asp Asn Ser Glu Glu ***

2975 GGCTGTTTTG GCGGATGAGA GAAGATTTTC AGCCTGATAC

3015 AGATTAAATC AGAACGCAGA AGCGGTCTGA TAAAACAGAA

3055 TTTGCCTGGC GGCAGTAGCG CGGTGGTCCC ACCTGACCCC

3095 ATGCCGAACT CAGAAGTGAA ACGCCGTAGC GCCGATGGTA

3135 GTGTGGGGTC TCCCCATGCG AGAGTAGGGA ACTGCCAGGC

3175 ATCAAATAAA ACGAAAGGCT CAGTCGAAAG ACTGGGCCTT

3215 TCGTTTTATC TGTTGTTTGT CGGTGAACGC TCTCCTGAGT

3255 AGGACAAATC CGCCGGGAGC GGATTTGAAC GTTGCGAAGC

3295 AACGGCCCGG AGGGTGGCGG GCAGGACGCC CGCCATAAAC

3335 TGCCAGGCAT CAAATTAAGC AGAAGGCCAT CCTGACGGAT



Figure 7b (cont'd)

3375 GGCCTTTTTG CGTTTCTACA AACTCTTTTG TTTATTTTTC

3415 TAAATACATT CAAATATGTA TCCGCTCATG AGACAATAAC

3455 CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGT AT
1> Me

3495 G AGT ATT CAA CAT TTC CGT GTC GCC CTT ATT
1> t Ser Ile Gln His Phe Arg Val Ala Leu Ile

3526 CCC TTT TTT GCG GCA TTT TGC CTT CCT GTT TTT
12> Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe

3559 GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT
23> Ala His Pro Glu Thr Leu Val Lys Val Lys Asp

3592 GCT GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC
34> Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr

3625 ATC GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT
45> Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu

3658 GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG
56> Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met

3691 ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC GCG
67> Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala

3724 GTA TTA TCC CGT GTT GAC GCC GGG CAA GAG CAA
78> Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln

3757 CTC GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC
89> Leu Gly Arg Arg Ile His Tyr Ser Gln Asn Asp

ScaI
3790 TTG GTT GAG TAC TCA CCA GTC ACA GAA AAG CAT
100> Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His

3823 CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC
111> Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys
Figure 7b (cont'd)

3856 AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG
122> Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala

3889 GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA CCG
133> Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro

3922 AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG
144> Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly

3955 GAT CAT GTA ACT CGC CTT GAT CGT TGG GAA CCG
155> Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro

3988 GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT
166> Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg

4021 GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG
177> Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr

4054 TTG CGC AAA CTA TTA ACT GGC GAA CTA CTT ACT
188> Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr

4087 CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG
199> Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met

4120 GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC
210> Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg

4153 TCG GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT
221> Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp

4186 AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT
232> Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly

4219 ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC
243> Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro

4252 TCC CGT ATC GTA GTT ATC TAC ACG ACG GGG AGT
254> Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser
Figure 7b (cont'd)

4285 CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC
265> Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile

4318 GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
276> Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp

4351 TAA CTGTCAGACC AAGTTTACTC ATATATACTT
287> ***








4384 


TAGATTGATT 




GrAGCvjf(jUUC 


A rrrpTV A^'V ^^rV2 


A A ^ A 

4424 


GCGGGTGTGG 


TGGITACGCG 


CAGCGJAjALL 




A A ^ A 

4464 


CCAGCGCCCT 


AGCGCCCGCT 


CCi i iXJGCTT 


TCT. 1 1 1 V- 


4504 


CTTTCTCGCC 


ACGjLMXJGCCG 


GC i 1 IX^CC^v^Vjy 




A tr A A 

4544 


AATCGGGGGC 


TCCCi i iAGG 


Gi i CCGAl i 1 


TV /^rrv~v ""M • 1 ■ 1 


A n A 

4584 


GGCACCTCGA 


/^O/^/^TV TV IV TV TV TV 

CCCCAAAAAA 


CTTGAi i IxjCj 


\jri\i/ii\iVji 


4624 ACGTAGTGGG CCATCGCCCT GATAGACGGT TTTTCGCCCT

4664 TTGACGTTGG AGTCCACGTT CTTTAATAGT GGACTCTTGT

4704 TCCAAACTTG AACAACACTC AACCCTATCT CGGGCTATTC

4744 TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG

4784 TTAAAAAATG AGCTGATTTA ACAAAAATTT AACGCGAATT

4824 TTAACAAAAT ATTAACGTTT ACAATTTAAA AGGATCTAGG

4864 TGAAGATCCT TTTTGATAAT CTCATGACCA AAATCCCTTA

4904 ACGTGAGTTT TCGTTCCACT GAGCGTCAGA CCCCGTAGAA

4944 AAGATCAAAG GATCTTCTTG AGATCCTTTT TTTCTGCGCG

4984 TAATCTGCTG CTTGCAAACA AAAAAACCAC CGCTACCAGC

5024 GGTGGTTTGT TTGCCGGATC AAGAGCTACC AACTCTTTTT
Figure 7b (cont'd)

5064 CCGAAGGTAA CTGGCTTCAG CAGAGCGCAG ATACCAAATA

5104 CTGTCCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA

5144 GAACTCTGTA GCACCGCCTA CATACCTCGC TCTGCTAATC

5184 CTGTTACCAG TGGCTGCTGC CAGTGGCGAT AAGTCGTGTC

5224 TTACCGGGTT GGACTCAAGA CGATAGTTAC CGGATAAGGC

5264 GCAGCGGTCG GGCTGAACGG GGGGTTCGTG CACACAGCCC

5304 AGCTTGGAGC GAACGACCTA CACCGAACTG AGATACCTAC

5344 AGCGTGAGCT ATGAGAAAGC GCCACGCTTC CCGAAGGGAG

5384 AAAGGCGGAC AGGTATCCGG TAAGCGGCAG GGTCGGAACA

5424 GGAGAGCGCA CGAGGGAGCT TCCAGGGGGA AACGCCTGGT

5464 ATCTTTATAG TCCTGTCGGG TTTCGCCACC TCTGACTTGA

5504 GCGTCGATTT TTGTGATGCT CGTCAGGGGG GCGGAGCCTA

5544 TGGAAAAACG CCAGCAACGC GGCCTTTTTA CGGTTCCTGG

5584 CCTTTTGCTG GCCTTTTGCT CACATGTTCT TTCCTGCGTT

5624 ATCCCCTGAT TCTGTGGATA ACCGTATTAC CGCCTTTGAG

5664 TGAGCTGATA CCGCTCGCCG CAGCCGAACG ACCGAGCGCA

5704 GCGAGTCAGT GAGCGAGGAA GCGGAAGAGC GCCTGATGCG

5744 GTATTTTCTC CTTACGCATC TGTGCGGTAT

5784 ATAGGGTCAT GGCTGCGCCC CGACACCCGC CAACACCCGC

5824 TGACGCGCCC TGACGGGCTT GTCTGCTCCC GGCATCCGCT

5864 TACAGACAAG CTGTGACCGT CTCCGGGAGC TGCATGTGTC

5904 AGAGGTTTTC ACCGTCATCA CCGAAACGCG CGAGGCAGCA

5944 AGGAGATGGC GCCCAACAGT CCCCCGGCCA CGGGGCCTGC

5984 CACCATACCC ACGCCGAAAC AAGCGCTCAT GAGCCCGAAG 
6024 TGGCGAGCCC GATCTTCCCC ATCGGTGATG TCGGCGATAT 
6064 AGGCGCCAGC AACCGCACCT GTGGCGCCGG TGATGCCGGC 
6104 CACGATGCGT CCGGCX3TAGA GGATCTGCTC ATGTTTGACA 
6144 GCTTATC 
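
In Figure 7b, rows whose residue numbers run in descending order are annotated on the complementary strand, so each residue corresponds to the reverse complement of the codon printed above it. The sketch below shows one way to check such a row against the standard genetic code, using the row at position 285 as input. The function names are hypothetical and the script is an illustration for the reader, not part of the disclosed method.

```python
# Illustrative check of a complementary-strand annotation row.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, built in the conventional TCAG ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def annotate_reverse_strand(row: str) -> str:
    """One-letter residues for each printed codon, in printed order,
    read on the complementary strand."""
    return "".join(CODON_TABLE[reverse_complement(c)] for c in row.split())


if __name__ == "__main__":
    row_285 = "TTG GTC CTC GCG CCA GCT TAA GAC GCT AAT CCC"
    # The figure lists Gln Asp Glu Arg Trp Ser Leu Val Ser Ile Gly here.
    print(annotate_reverse_strand(row_285))  # QDERWSLVSIG
```

A row that fails this check points to a misread base or residue in the listing rather than to a feature of the sequence itself.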
Figure 8b

Figure 9a
[Plasmid map; labels: AvaI, AvaI (5015), AvaI (3822), lacZ, loxP, loxP, Cre]
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Figure 9 
Figure 10a
[Schematic; labels: lacZ; pSV-sacB-neo, bla; selection on Amp + Kan; BamHI; counter-selection on Amp + 7% sucrose; 1.5 kb-BamHI; pSVpaXI, bla]
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Figure 1 1 a 




SUBSTITUTE SHEET (RULE 26)

WO 99/29837 33/65 PCT/EP98/07945

Figure 11b



Lanes: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17








[Plot axis: L-Ara induction, 0 to 250 min]






Figure 13a

[Plasmid map; labelled restriction sites: ScaI, BamHI, NheI, EcoRI, NcoI, BamHI, EcoRI]
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Figure 13b 

1 ATCGATGCATAATGTGCCTGTCAAATGGACGAAGCAGGG

40 ATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATT

79 GTCTGATTCGTTACCAA TTA TGA CAA CTT GAC
293<••• Ser Leu Lys Val

111 GGC TAC ATC ATT CAC TTT TTC TTC ACA ACC
288<Ala Val Asp Asn Val Lys Glu Glu Cys Gly

141 GGC ACG GAA CTC GCT CGG GCT GGC CCC GGT
278<Ala Arg Phe Glu Ser Pro Ser Ala Gly Thr

171 GCA TTT TTT AAA TAC CCG CGA GAA ATA GAG
268<Cys Lys Lys Phe Val Arg Ser Phe Tyr Leu

201 TTG ATC GTC AAA ACC AAC ATT GCG ACC GAC
258<Gln Asp Asp Phe Gly Val Asn Arg Gly Val

231 GGT GGC GAT AGG CAT CCG GGT GGT GCT CAA
248<Thr Ala Ile Pro Met Arg Thr Thr Ser Leu

261 AAG GAG CTT GGC CTG GCT GAT ACG TTG GTC
238<Leu Leu Lys Ala Gln Ser Ile Arg Gln Asp

291 CTC GCG CCA GCT TAA GAC GCT AAT CCC TAA
228<Glu Arg Trp Ser Leu Val Ser Ile Gly Leu

321 CTG CTG GCG GAA AAG ATG TGA GAG ACG CGA
218<Gln Gln Arg Phe Leu His Ser Leu Arg Ser

351 CGG CGA CAA GCA AAC ATG CTG TGC GAC GCT
208<Pro Ser Leu Cys Val His Gln Ala Val Ser

381 GGC GAT ATC AAA ATT GCT GTC TGC CAG GTG
198<Ala Ile Asp Phe Asn Ser Asp Ala Leu His

411 ATC GCT GAT GTA CTG ACA AGC CTC GCG TAC
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Figure 1 3b (cont'd) 



188<Asp Ser Ile Tyr Gln Cys Ala Glu Arg Val

441 CCG ATT ATC CAT CGG TGG ATG GAG CGA CTC
178<Arg Asn Asp Met Pro Pro His Leu Ser Glu

471 GTT AAT CGC TTC CAT GCG CCG CAG TAA CAA
168<Asn Ile Ala Glu Met Arg Arg Leu Leu Leu

501 TTG CTC AAG CAG ATT TAT CGC CAG CAG CTC
158<Gln Glu Leu Leu Asn Ile Ala Leu Leu Glu

531 CGA ATA GCG CCC TTC CCC TTG CCC GGC GTT
148<Ser Tyr Arg Gly Glu Gly Gln Gly Ala Asn

561 AAT GAT TTG CCC AAA CAG GTC GCT GAA ATG
138<Ile Ile Gln Gly Phe Leu Asp Ser Phe His

591 CGG CTG GTG CGC TTC ATC CGG GCG AAA GAA
128<Pro Gln His Ala Glu Asp Pro Arg Phe Phe

621 CCC CGT ATT GGC AAA TAT TGA CGG CCA GTT
118<Gly Thr Asn Ala Phe Ile Ser Pro Trp Asn

651 AAG CCA TTC ATG CCA GTA GGC GCG CGG ACG
108<Leu Trp Glu His Trp Tyr Ala Arg Pro Arg

681 AAA GTA AAC CCA CTG GTG ATA CCA TTC GCG
98<Phe Tyr Val Trp Gln His Tyr Trp Glu Arg

711 AGC CTC CGG ATG ACG ACC GTA GTG ATG AAT
88<Ala Glu Pro His Arg Gly Tyr His His Ile

741 CTC TCC TGG CGG GAA CAG CAA AAT ATC ACC
78<Glu Gly Pro Pro Phe Leu Leu Ile Asp Gly

771 CGG TCG GCA AAC AAA TTC TCG TCC CTG ATT
68<Pro Arg Cys Val Phe Glu Arg Gly Gln Asn
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Figure 13b (cont'd) 



801 TTT CAC CAC CCC CTG ACC GCG AAT GGT GAG
58<Lys Val Val Gly Gln Gly Arg Ile Thr Leu

831 ATT GAG AAT ATA ACC TTT CAT TCC CAG CGG
48<Asn Leu Ile Tyr Gly Lys Met Gly Leu Pro

861 TCG GTC GAT AAA AAA ATC GAG ATA ACC GTT
38<Arg Asp Ile Phe Phe Asp Leu Tyr Gly Asn

891 GGC CTC AAT CGG CGT TAA ACC CGC CAC CAG
28<Ala Glu Ile Pro Thr Leu Gly Ala Val Leu

921 ATG GGC ATT AAA CGA GTA TCC CGG CAG CAG
18<His Ala Asn Phe Ser Tyr Gly Pro Leu Leu

951 GGG ATC ATT TTG CGC TTC AGC CAT ACTTTTC
8<Pro Asp Asn Gln Ala Glu Ala Met







982 ATACTCCCGCCATTCAGAGAAGAAACCAATTGTCCATAT 



1021 TGCATCAGACATTGCCGTCACTGCGTCTTTTACTGGCTC 



1060 TTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGC 



1099 ATTCTGTAACAAAGCGGGACCAAAGCCATGACAAAAACG 



1138 CGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCAC 



1177 ATTGATTATTTGCACGGCGTCACACTTTGCTATGCCATA 



BamHI

1216 GCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGC 
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Figure 1 3b (cont'd) 

1255 TTTTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTT

NheI EcoRI NcoI BamHI

1294 TTTTGGGCTAGCAGGAGGAATTCACC ATG GAT CCC
1>Met Asp Pro

1329 GTA ATC GTA GAA GAC ATA GAG CCA GGT ATT
4>Val Ile Val Glu Asp Ile Glu Pro Gly Ile

1359 TAT TAC GGA ATT TCG AAT GAG AAT TAC CAC
14>Tyr Tyr Gly Ile Ser Asn Glu Asn Tyr His

1389 GCG GGT CCC GGT ATC AGT AAG TCT CAG CTC
24>Ala Gly Pro Gly Ile Ser Lys Ser Gln Leu

1419 GAT GAC ATT GCT GAT ACT CCG GCA CTA TAT
34>Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr

1449 TTG TGG CGT AAA AAT GCC CCC GTG GAC ACC
44>Leu Trp Arg Lys Asn Ala Pro Val Asp Thr

1479 ACA AAG ACA AAA ACG CTC GAT TTA GGA ACT
54>Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr

1509 GCT TTC CAC TGC CGG GTA CTT GAA CCG GAA
64>Ala Phe His Cys Arg Val Leu Glu Pro Glu
EcoRI

1539 GAA TTC AGT AAC CGC TTT ATC GTA GCA CCT
74>Glu Phe Ser Asn Arg Phe Ile Val Ala Pro

1569 GAA TTT AAC CGC CGT ACA AAC GCC GGA AAA
84>Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys

1599 GAA GAA GAG AAA GCG TTT CTG ATG GAA TGC
94>Glu Glu Glu Lys Ala Phe Leu Met Glu Cys

1629 GCA AGC ACA GGA AAA ACG GTT ATC ACT GCG
104>Ala Ser Thr Gly Lys Thr Val Ile Thr Ala
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Figure 13b (cont'd) 



1659 GAA GAA GGC CGG AAA ATT GAA CTC ATG TAT
114>Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr

1689 CAA AGC GTT ATG GCT TTG CCG CTG GGG CAA
124>Gln Ser Val Met Ala Leu Pro Leu Gly Gln

1719 TGG CTT GTT GAA AGC GCC GGA CAC GCT GAA
134>Trp Leu Val Glu Ser Ala Gly His Ala Glu

1749 TCA TCA ATT TAC TGG GAA GAT CCT GAA ACA
144>Ser Ser Ile Tyr Trp Glu Asp Pro Glu Thr

1779 GGA ATT TTG TGT CGG TGC CGT CCG GAC AAA
154>Gly Ile Leu Cys Arg Cys Arg Pro Asp Lys

1809 ATT ATC CCT GAA TTT CAC TGG ATC ATG GAC
164>Ile Ile Pro Glu Phe His Trp Ile Met Asp

1839 GTG AAA ACT ACG GCG GAT ATT CAA CGA TTC
174>Val Lys Thr Thr Ala Asp Ile Gln Arg Phe

1869 AAA ACC GCT TAT TAC GAC TAC CGC TAT CAC
184>Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr His

1899 GTT CAG GAT GCA TTC TAC AGT GAC GGT TAT
194>Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr

1929 GAA GCA CAG TTT GGA GTG CAG CCA ACT TTC
204>Glu Ala Gln Phe Gly Val Gln Pro Thr Phe

1959 GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA
214>Val Phe Leu Val Ala Ser Thr Thr Ile Glu

1989 TGC GGA CGT TAT CCG GTT GAA ATT TTC ATG
224>Cys Gly Arg Tyr Pro Val Glu Ile Phe Met

2019 ATG GGC GAA GAA GCA AAA CTG GCA GGT CAA
234>Met Gly Glu Glu Ala Lys Leu Ala Gly Gln
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Figure 1 3b (cont'd) 

2049 CAG GAA TAT CAC CGC AAT CTG CGk ACC CTG
244>Gln Glu Tyr His Arg Asn Leu Arg Thr Leu

2079 TCT GAC TGC CTG AAT ACC GAT GAA TGG CCA
254>Ser Asp Cys Leu Asn Thr Asp Glu Trp Pro

2109 GCT ATT AAG ACA TTA TCA CTG CCC CGC TGG
264>Ala Ile Lys Thr Leu Ser Leu Pro Arg Trp

XhoI KpnI

2139 GCT AAG GAA TAT GCA AAT GAC TAGATCTCGAG
274>Ala Lys Glu Tyr Ala Asn Asp

2171 GTACCCGAGCACGTGTTGACAATTAATCATCGGCATAGT

2210 ATATCGGCATAGTATAATACGACAAGGTGAGGAACTAAA
NcoI

2249 CC ATG GCT AAG CAA CCA CCA ATC GCA AAA
1>Met Ala Lys Gln Pro Pro Ile Ala Lys



2278 GCC GAT CTG CAA AAA ACT CAG GGA AAC CGT
10>Ala Asp Leu Gln Lys Thr Gln Gly Asn Arg

2308 GCA CCA GCA GCA GTT AAA AAT AGC GAC GTG
20>Ala Pro Ala Ala Val Lys Asn Ser Asp Val

2338 ATT AGT TTT ATT AAC CAG CCA TCA ATG AAA
30>Ile Ser Phe Ile Asn Gln Pro Ser Met Lys

2368 GAG CAA CTG GCA GCA GCT CTT CCA CGC CAT
40>Glu Gln Leu Ala Ala Ala Leu Pro Arg His

2398 ATG ACG GCT GAA CGT ATG ATC CGT ATC GCC
50>Met Thr Ala Glu Arg Met Ile Arg Ile Ala

2428 ACC ACA GAA ATT CGT AAA GTT CCG GCG TTA
60>Thr Thr Glu Ile Arg Lys Val Pro Ala Leu
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Figure 13b (cont'd) 



2458 GGA AAC TGT GAC ACT ATG AGT TTT GTC AGT
70>Gly Asn Cys Asp Thr Met Ser Phe Val Ser

2488 GCG ATC GTA CAG TGT TCA CAG CTC GGA CTT
80>Ala Ile Val Gln Cys Ser Gln Leu Gly Leu

2518 GAG CCA GGT AGC GCC CTC GGT CAT GCA TAT
90>Glu Pro Gly Ser Ala Leu Gly His Ala Tyr

2548 TTA CTG CCT TTT GGT AAT AAA AAC GAA AAG
100>Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys

2578 AGC GGT AAA AAG AAC GTT CAG CTA ATC ATT
110>Ser Gly Lys Lys Asn Val Gln Leu Ile Ile

2608 GGC TAT CGC GGC ATG ATT GAT CTG GCT CGC
120>Gly Tyr Arg Gly Met Ile Asp Leu Ala Arg

2638 CGT TCT GGT CAA ATC GCC AGC CTG TCA GCC
130>Arg Ser Gly Gln Ile Ala Ser Leu Ser Ala

2668 CGT GTT GTC CGT GAA GGT GAC GAG TTT AGC
140>Arg Val Val Arg Glu Gly Asp Glu Phe Ser

2698 TTC GAA TTT GGC CTT GAT GAA AAG TTA ATA
150>Phe Glu Phe Gly Leu Asp Glu Lys Leu Ile

2728 CAC CGC CCG GGA GAA AAC GAA GAT GCC CCG
160>His Arg Pro Gly Glu Asn Glu Asp Ala Pro

2758 GTT ACC CAC GTC TAT GCT GTC GCA AGA CTG
170>Val Thr His Val Tyr Ala Val Ala Arg Leu

2788 AAA GAC GGA GGT ACT CAG TTT GAA GTT ATG
180>Lys Asp Gly Gly Thr Gln Phe Glu Val Met

2818 ACG CGC AAA CAG ATT GAG CTG GTG CGC AGC
190>Thr Arg Lys Gln Ile Glu Leu Val Arg Ser
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Figure 13b (cont'd) 



2848 CTG 


AGT 


AAA GCT 


wwx 




AAP 




CCG 


TGG 


200^ Leu 


Ser 


Lys Ala 


Gly 

J 


Asn 


Asn 


Gl y 


Pro Trp 


2878 GTA 


APT 


PAP Tnn, 










AAG 


AAA 


210^Val 


Thr 


His Trp 


Gl u 


Gl u 


Met 


Ai a 


Lys 


Lys 


2908 ACG 


GCT 


AX X ^vjrX 






ITTTV-l 

1 1 


AAA 
AAA 


TAT 
x^^x 


ttt; 

X XVJ 


220^Thr 


Al a 


Me A rg 


A rg 


Leu 


Phe 


Lvs 


Tvr 


Leu 

hm W U 


2938 CCC 


GTA 


X>w«ri rlX X 










GCA 


GTA 


230^ Pro 


Val 


Ser Me 


Gl u 


1 1 e 


Gl n 


A rn 
r\ 1 y 


Ala 


Val 


2968 TCA 


ATG 


GAT GAA 


AAG 


GAA 


CCA 


CTG 


ACA 


ATC 


240^Ser 


Met 


Asp Gl u 


Lys 


Gl u 
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Leu 


Thr 


1 le 


2998 GAT 


CCT 


GCA GAT 


TCC 


TCT 


GTA 


TTA 


ACC 


GGG 


250^Asp 


Pro 


Al a Asp 


Ser 


Ser 


Val 


Leu 


Thr 


Gly 


3028 GAA 


TAG 


AGT GTA 


ATC 


GAT 


AAT 


TCA 


GAG 


GAA 


260^Glu 


Tyr 


Ser Val 


Me 


Asp Asn 


Ser 


Glu 


Gl u 


BglII HindIII

3058 TAG ATCTAAGCTTCCTGCTGAACATCAAAGGCAAGAAA
270>•••



3096 ACATCTGTTGTCAAAGACAGCATCCTTGAACAAGGACAA 

3135 TTAACAGTTAACAAATAAAAACGCAAAAGAAAATGCCGA 

3174 TATCCTATTGGCATTTTCTTTTATTTCTTATCAACATAA 

XhoI

3213 AGGTGAATCCCATACCTCGAGCTTCACGCTGCCGCAAGC 

3252 ACTCAGGGCGCAAGGGCTGCTAAAAGGAAGCGGAACACG 

3291 TAGAAAGCCAGTCCGCAGAAACGGTGCTGACCCCGGATG 

3330 AATGTCAGCTACTGGGCTATCTGGACAAGGGAAAACGCA 

3369 AGCGCAAAGAGAAAGCAGGTAGCTTGCAGTGGGCTTACA 
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Figure 1 3b (cont'd) 

3408 TGGCGATAGCTAGACTGGGCGGTTTTATGGACAGCAAGC 

3447 GAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAAGGTT 

3486 GGGAAGCCCTGCAAAGTAAACTGGATGGCTTTCTTGCCG 

BglII

3525 CCAAGGATCTGATGGCGCAGGGGATCAAGATCTGATCAA 



3564 GAGACAGGATGAGGATCGTTTCGC ATG GAT ATT 

1>Met Asp Ile



oby / AAT 
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GAA 


TV / VP 

ACT 


GAG 
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CAT 


A ^ A c n 


Th r 
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TAT 
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AGT 


TCA 
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AAC 
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34^Ser 
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Ser 


Ser 


Al a 


Met 


Asn 


Al a 


3717 TAT 


TAG 


ATT 


CAG 


GAT 


CGT 


CTT 
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CAG 


44^Tyr 


Tyr 


i ie 


Gin 


Asp 


A rg 


Leu 


Gl u 


Al a 


Gin 


3747 AGC 


TGG 


GCG 


CGT 
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TAC 


CAG 


CAG 


CTC 


GCC 


54^Ser 


Trp 


Ala 


Arg 


His 


Tyr 


Gl n 


Gin 


Leu 


Ala 


3777 CGT 


GAA 


GAG 


AAA 


GAG 


GCA 


GAA 


CTG 
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GAC 


64^Arg 


Glu 


Gl u 


Lys 


Glu 


Al a 


Gl u 


Leu 


Al a Asp 


3807 GAC 


ATG 


GAA 
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GGC 


CTG 
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CAG 


CAC 


CTG 


74^Asp 


Met 
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Lys 


Gly 
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3837 TTT 


GAA 
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ATC 


GAT 


CAT 


TTG 


CAA 


84^Phe 


Glu 


Ser 


Leu 


Cys 


i 1 e Asp 


His 


Leu 


Gl n 


3867 CGC 


CAC 


GGG 


GCC 


AGC 


AAA 


AAA 


TCC 


ATT 


ACC 


94>Arg 


His 


Gly 


Ala 


Ser 


Lys 


Lys 


Ser 


1 le 


Thr 
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Figure 1 3b (cont'd) 

3897 CGT GCG TTT GAT GAC GAT GTT GAG TTT CAG
104>Arg Ala Phe Asp Asp Asp Val Glu Phe Gln

3927 GAG CGC ATG GCA GAA CAC ATC CGG TAC ATG
114>Glu Arg Met Ala Glu His Ile Arg Tyr Met

3957 GTT GAA ACC ATT GCT CAC CAC CAG GTT GAT
124>Val Glu Thr Ile Ala His His Gln Val Asp

HindIII

3987 ATT GAT TCA GAG GTA TAA AACGAGTAGAAGCT
134>Ile Asp Ser Glu Val •••

4019 TGGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGAT

4058 ACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACA 

4097 GAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA 

4136 CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGA 

4175 TGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTG 

4214 CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACT 

4253 GGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTC 

4292 TCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACG

4331 TTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGGACGCC 

4370 CGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCA

4409 TCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTT 

4448 TGTTTATTTTTCTAAATACATTCAAATATGTATCCGCTC 

4487 ATGAGACAATAACCCTGATAAATGCTTCAATAATATTGA 

4526 AAAAGGAAGAGT ATG AGT ATT CAA CAT TTC 

1>Met Ser Ile Gln His Phe
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Figure 1 3b (cont'd) 

4556 CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA
7>Arg Val Ala Leu Ile Pro Phe Phe Ala Ala

4586 TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA
17>Phe Cys Leu Pro Val Phe Ala His Pro Glu

4616 ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT
27>Thr Leu Val Lys Val Lys Asp Ala Glu Asp

4646 CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA
37>Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu

4676 CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG
47>Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu

4706 AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG
57>Ser Phe Arg Pro Glu Glu Arg Phe Pro Met

4736 ATG AGC ACT TTT AAA GTT CTC CTA TGT GGC
67>Met Ser Thr Phe Lys Val Leu Leu Cys Gly

4766 GCG GTA TTA TCC CGT GTT GAC GCC GGG CAA
77>Ala Val Leu Ser Arg Val Asp Ala Gly Gln

4796 GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT
87>Glu Gln Leu Gly Arg Arg Ile His Tyr Ser

ScaI

4826 CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC
97>Gln Asn Asp Leu Val Glu Tyr Ser Pro Val

4856 ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA
107>Thr Glu Lys His Leu Thr Asp Gly Met Thr

4886 GTA AGA GAA TTA TGC AGT GCT GCC ATA ACC
117>Val Arg Glu Leu Cys Ser Ala Ala Ile Thr

4916 ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT
127>Met Ser Asp Asn Thr Ala Ala Asn Leu Leu
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Figure 13b (cont'd) 
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A AT 
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/^TV O 

GAC 


GAG 


CGT 


167 ►Leu 


Asn 


Gl u 


Ala 
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177^Asp 
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ROQfi Apr; 
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AAA 




X xA 


TV 




TV 
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LIA 


187 ►Thr 
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A rg 


Lys 


Leu 


Leu 


Thr 
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VJI u 


1 oil 




APT 

n^X 


CTA 


GCT 






OTV TV 

LAA 


LAA 


TTA 


ATA 


197^ Leu 


Thr 


Leu 


Ala 


Ser 


A ra 


Gl n 


Gl n 


Leu 


i le 


^ X «^ O VjnV^ 


Xvjvj 


ATG 


GAG 






TV "A TV 

AAA 


GTT 


GCA 


GGA 


207^Asp 


T rp 


Met 


Gl u 


Al a 


ASD 


Lv s 


Val 


Al a 


Gl y 




VwX 1 


CTG. CGC 


X\J<3 
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CTT 


CCG 
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GGC 


217 ►Pro 


Leu 


Leu Arg 


Ser 


Al a 


Leu 


Pro 


Ala 


Gly 




rrrTTTi 
XXX 


ATT 


GCT 


\j£\X 


TV TV ^ 

AAA 


TCT 


GGA 


GCC 


GGT 


227^ TfD 


Php 


i 1 e 


Ala 




Lys 


Qo r 

os r 


Val y 


Ala 


Gly 


5246 GAG 


CGT 


GGG 


TCT 


CGC 


GGT 


ATC 


ATP 


GCA 


GCA 


237^Glu 


A rg 


Gly 


Ser 


Arg 


Gly 


1 1 e 


1 le 


Ala 


Ala 


5276 CTG 


GGG 


CCA 


GAT 


GGT 


AAG 
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TCC 


CGT 


ATC 


247^ Leu 


Gl y 


Pro Asp 


Gl y 


Lys 


Pro 


Ser 


A rg 


1 le 


5306 GTA 


GTT 


ATC 


TAC 


ACG 


ACG 


GGG 


AGT 


CAG 


GCA 


257^ Val 


Val 


1 le 
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Thr 


Gl y 


Ser 


Gin 


Ala 
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Figure 13b (cont'd) 

5336 ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT
267>Thr Met Asp Glu Arg Asn Arg Gln Ile Ala

5366 GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
277>Glu Ile Gly Ala Ser Leu Ile Lys His Trp

5396 TAA CTGTCAGACCAAGTTTACTCATATATACTTTAGAT 

5434 TGATTTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGG 
5473 GTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCA 
5512 GCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCT 
5551 TTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAA 
5590 ATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTAC 
5629 GGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTT 
5668 CACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC 
5707 CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCT 
5746 TGTTCCAAACTTGAACAACACTCAACCCTATCTCGGGCT 
5785 ATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCT
5824 ATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACG 
5863 CGAATTTTAACAAAATATTAACGTTTACAATTTAAAAGG 
5902 ATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAA
5941 ATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGAC 
5980 CCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTT 
6019 TTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCA 
6058 CCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTA 
6097 CCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCG
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Figure 1 3b (cont'd) 

6136 CAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTA 

6175 GGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATAC 

6214 CTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGT

6253 GGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGA

6292 TAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGG 

6331 GGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTAC

6370 ACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGC 

6409 GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCG 

6448 GTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAG 

6487 CTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTC 

6526 GGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGA 

6565 TGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGC 

6604 AACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCT 

6643 TTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCT

6682 GTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACC

6721 GCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTG 

6760 AGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTC 

6799 CTTACGCATCTGTGCGGTATTTCACACCGCATAGGGTCA 

6838 TGGCTGCGCCCCGACACCCGCCAACACCCGCTGACGCGC 

6877 CCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC 

6916 AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGT 

6955 TTTCACCGTCATCACCGAAACGCGCGAGGCAGCAAGGAG 
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Figure 1 3b (cont'd) 

6994 ATGGCGCCCAACAGTCCCCCGGCCACGGGGCCTGCCACC
7033 ATACCCACGCCGAAACAAGCGCTCATGAGCCCGAAGTGG
7072 CGAGCCCGATCTTCCCCATCGGTGATGTCGGCGATATAG
7111 GCGCCAGCAACCGCACCTGTGGCGCCGGTGATGCCGGCC
7150 ACGATGCGTCCGGCGTAGAGGATCTGCTCATGTTTGACA
7189 GCTTATC 
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Figure 14 a 



[Plasmid map; labelled restriction sites: EcoRV, ScaI]
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Figure 14b 

NsiI

1 ATCGATGCATAATGTGCCTGTCAAATGGACGAAGCAGGG
40 ATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATT
79 GTCTGATTCGTTACCAA TTA TGA CAA CTT GAC 
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Figure 14b (cont'd) 
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Trp 


Tyr Ala 


Arg 


Pro A rg 


681 AAA GTA AAC CCA CTG GTG ATA CCA TTC GCG
98<Phe Tyr Val Trp Gln His Tyr Trp Glu Arg

711 AGC CTC CGG ATG ACG ACC GTA GTG ATG AAT
88<Ala Glu Pro His Arg Gly Tyr His His Ile

741 CTC TCC TGG CGG GAA CAG CAA AAT ATC ACC
78<Glu Gly Pro Pro Phe Leu Leu Ile Asp Gly

771 CGG TCG GCA AAC AAA TTC TCG TCC CTG ATT
68<Pro Arg Cys Val Phe Glu Arg Gly Gln Asn






Figure 14b (cont'd) 
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8^ Pro 
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982 ATACTCCCGCCATTCAGAGAAGAAACCAATTGTCCATAT 



1021 TGCATCAGACATTGCCGTCACTGCGTCTTTTACTGGCTC 

1060 TTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGC 

1099 ATTCTGTAACAAAGCGGGACCAAAGCCATGACAAAAACG 

1138 CGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCAC 

1177 ATTGATTATTTGCACGGCGTCACACTTTGCTATGCCATA 

BamHI 

1216 GCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGC 

1255 TTTTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTT 

NheI EcoRI
1294 TTTTGGGCTAGCAGGAGGAATTCACC ATG ACA CCG

1>Met Thr Pro

PstI 

1329 GAC ATT ATC CTG CAG CGT ACC GGG ATC GAT 






Figure 14b (cont'd) 
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Lys 


Trp 


Pro Asp Met 


1479 AAA ATG TCC TAC TTC CAC ACC CTG CTT GCT
54>Lys Met Ser Tyr Phe His Thr Leu Leu Ala

1509 GAG GTT TGC ACC GGT GTG GCT CCG GAA GTT
64>Glu Val Cys Thr Gly Val Ala Pro Glu Val

1539 AAC GCT AAA GCA CTG GCC TGG GGA AAA CAG
74>Asn Ala Lys Ala Leu Ala Trp Gly Lys Gln

EcoRI

1569 TAC GAG AAC GAC GCC AGA ACC CTG TTT GAA
84>Tyr Glu Asn Asp Ala Arg Thr Leu Phe Glu




ACT 


TCC 




GTG 


AAT 


GTT 


ACT 


GAA 


TCC 


94^Phe 


Thr 


Ser 


Gly 


Val 


Asn 


Val 


Thr 


Gl u 


Ser 


1629 CCG ATC ATC TAT CGC GAC GAA AGT ATG CGT
104>Pro Ile Ile Tyr Arg Asp Glu Ser Met Arg

1659 ACC GCC TGC TCT CCC GAT GGT TTA TGC AGT
114>Thr Ala Cys Ser Pro Asp Gly Leu Cys Ser

1689 GAC GGC AAC GGC CTT GAA CTG AAA TGC CCG
124>Asp Gly Asn Gly Leu Glu Leu Lys Cys Pro
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Figure 14b (cont'd) 



1719 TTT 


ACQ 


TCG 


GGG 


GAT 


TTC 


ATG 


AAG 


TTC 


GGG 


134^Phe 


Thr 


Ser 


Arg 


Asp 


Phe 


Met 


Lys 


Phe Arg 


1749 CTC 


GOT 


GGT 


TTC 


GAG 


GGG 


ATA 


AAG 


TCA 


GGT 


144 ►Leu 


Gly 


Gly 


Phe 


Glu 


Ala 


1 le 


Lys 


Ser 


Al a 


1779 TAG 


ATG 


GGG 


GAG 


GTG 


GAG 


TAG 


AGG 


ATG 


TGG 


154^ Tyr 


Met 


Ai a 


Gi n 


Val 


Gl n 


Tyr 


Ser 


Met 


Trp 


1809 GTG 


ACG 


CGA 


AAA 


AAT 


GGG 


TGG 


TAG 


TTT 


GGG 


164^ Val 


Thr 


A rg 


Lys 


Asn 


Al a 


Trp 


Tyr 


Phe 


Al a 


1839 AAC 


TAT 


GAG 


GGG 


GGT 


ATG 


AAG 


GGT 


GAA 


GGG 


174^Asn 


Tyr Asp 


Pro 


A rg Met 


Lys 


A rg 


Glu 


Gly 




CAT 


TAT 




GTG 


ATT 






GAT 


GAA 


1 RA^ 1 Pii 


His 


Tyr 


Vet 1 


Val 


1 le 




A rn 


Asp 


Gl u 


1899 AAG 


TAG 


ATG 


GGG 


AGT 


•riT 


GAG 


GAG 


ATG 


GTG 


194 ►Lys 


Tyr Met 


AI a 


Ser 


Phe 


Asp 


Glu 


1 le 


Val 


1929 CCG 


GAG 


TTC 


ATC 


GAA 


AAA 


ATG 


GAG 


GAG 


GCA 


204^ Pro 


Glu 


Phe 


1 1 e 


Glu 


Lys 


Met 


Asp 


Glu 


Al a 


1959 CTG 


GGT 


GAA 


ATT 


GGT 


TTT 


GTA 


TTT 


GGG 


GAG 


214^ Leu 


Ala 


Gl u 


t 1 e 


Gly 


Phe 


Val 


Phe 


Gly 


Gl u 










Kpnl 











1989 GAA TGG GGA TAGATGCGGTAGCCGAGCAGGTGTTGA 
224> Gl n Trp A rg • * • 



2025 GAATTAATGATGGGGATAGTATATGGGGATAGTATAATA 

2064 GGAGAAGGTGAGGAAGTAAAGC ATG AGT AGT GGA 

l^Met Ser Thr Ala 

2098 GTG GGA ACG CTG GGT GGG AAG CTG GGT GAA 
5^Leu Ala Thr Leu Ala Gly Lys Leu Ala Glu 
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Figure 1 4b (cont'd) 



2128 CGT 


GTC 


GGC 


ATG 


A ra 


VCI 1 


\M y 


IVK? I 


2158 GAA 


CTG 


ATC 


ACC 


25^ Gl u 


L P ti 


1 1 w 


Thr 


2188 TTT 


^ ^ ^ 

AAA 


GGT 


GAT 


35^ Phe 




vai y 


A c n 


2218 ATC 


^^^^^ 
GCA 


TTA 


CTG 


45^ 1 1 e 


A! a 


1 Ptj 

L. w U 


1 Oil 


2248 GGC 


CTT 


AAT 


CGG 


55 ► Gl V 


1 All 
L» w U 


A Q n 


P rn 


2278 GCC 


'i'i'l' 


CCT 


GAT 


65^ Al a 


Phe 


P ro 


A ^ n 


2308 CCG 


GTG 


GTG 


GGC 


75^ Pro 


Val 


Val 


Gl V 
V3I y 


2338 ATC 


ATC 


AAT 


GAA 


i 1 p 

^ lie? 


1 le 


Asn 




2368 ATG 


GAG 


TTT 


GAG 


95^Mpt 


Asp 


Phe 


r^i II 

\3I u 


2398 ACA 


TGC 


CGG 


ATT 


105 ► Thr 


Cys Arg 


1 1 e 


2428 CAT 


CCG 


ATC 


TGC 


115 ► HI s 


Pro 


1 le 


Cys 


2458 GAA 


TGC 


CGC 


CGC 


125^Glu 


Cys A rg 


A rg 


2488 GAA 


GGC 


AGA 


GAA 


135 ►Gl u 


Gly Arg 


Gl u 







Sail 








GAT 


TCT 


GTC 


GAC 


CCA 


CAG 


Asp 


Ser 


Val 


Asp 


Pro 


Gi n 


ACT 


CTT 


CGC 


CAG 


ACG 


GCA 


Thr 


Leu Arg 


Gl n 


Thr 


Aia 


GCC 






GCG 


CAG 


TTC 


Al a 


Ser 


Asp 


Ala 


Gin 


Phe 


ATC 


fy ftp 




AAC 


CAG 


TAC 


1 1 e 


Val 


Ai a 


Asn 


Gin 


Tyr 


TGG 


ACG 


AAA 


GAA 


ATT 


TAC 


Trp 


Thr 


Lys 


G! u 


1 i e 


Tyr 


AAG 


CAG 


AAT 


GGC 


ATC 


GTT 


Lys 


Gl n 


Asn 


Gly 


1 1 e 


Val 


GTT 


GAT 


GGC 


TGG 


TCC 


CGC 


Val 


Asp 


Gly 


Trp 


Ser 


A rg 


AAC 


CAG 


CAG 


'i'iT 


GAT 


GGC 


Asn 


Gl n 


Gl n 


Phe 


Asp 


Gly 


GAG 


GAC 


AAT 


GAA 


TCC 


TGT 


Gl n 


Asp Asn 


Gl u 


Ser 


Cys 


TAC 


CGC 


AAG 


GAC 


CGT 


AAT 


Tyr 


A rg 


Lys 


Asp 


A rg Asn 


GTT 


ACC 


GAA 


TGG 


ATG 


GAT 


Val 


Thr 


Gl u 


Trp 


iViet 


Asp 


GAA 


CCA 


TTC 


AAA 


ACT 


CGC 


Gi u 


Pro 


Phe 


Lys 


Thr 


A rg 


ATC 


ACG 


GGG 


CCG 


TGG 


CAG 


1 le 


Thr 


Gl y 


Pro 


Trp 


Gin 
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Figure 14b (cont'd) 



2518 TCG CAT CCC AAA CGG ATG TTA CGT CAT AAA
145>Ser His Pro Lys Arg Met Leu Arg His Lys

2548 GCC ATG ATT CAG TGT GCC CGT CTG GCC TTC
155>Ala Met Ile Gln Cys Ala Arg Leu Ala Phe

2578 GGA TTT GCT GGT ATC TAT GAC AAG GAT GAA
165>Gly Phe Ala Gly Ile Tyr Asp Lys Asp Glu

2608 GCC GAG CGC ATT GTC GAA AAT ACT GCA TAC
175>Ala Glu Arg Ile Val Glu Asn Thr Ala Tyr

PstI

2638 ACT GCA GAA CGT CAG CCG GAA CGC GAC ATC
185>Thr Ala Glu Arg Gln Pro Glu Arg Asp Ile

2668 ACT CCG GTT AAC GAT GAA ACC ATG CAG GAG
195>Thr Pro Val Asn Asp Glu Thr Met Gln Glu

2698 ATT AAC ACT CTG CTG ATC GCC CTG GAT AAA
205>Ile Asn Thr Leu Leu Ile Ala Leu Asp Lys

2728 ACA TGG GAT GAC GAC TTA TTG CCG CTC TGT
215>Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys

2758 TCC CAG ATA TTT CGC CGC GAC ATT CGT GCA
225>Ser Gln Ile Phe Arg Arg Asp Ile Arg Ala

2788 TCG TCA GAA CTG ACA CAG GCC GAA GCA GTA
235>Ser Ser Glu Leu Thr Gln Ala Glu Ala Val

2818 AAA GCT CTT GGA TTC CTG AAA CAG AAA GCC
245>Lys Ala Leu Gly Phe Leu Lys Gln Lys Ala

BglII XhoI

2848 GCA GAG CAG AAG GTG GCA GCA TAGATCTCGAG
255>Ala Glu Gln Lys Val Ala Ala •••








Figure 14b (cont'd) 

HindIII

2880 AAGCTTCCTGCTGAACATCAAAGGCAAGAAAACATCTGT 

2919 TGTCAAAGACAGCATCCTTGAACAAGGACAATTAACAGT

2958 TAACAAATAAAAACGCAAAAGAAAATGCCGATATCCTAT 

2997 TGGCATTTTCTTTTATTTCTTATCAACATAAAGGTGAAT

XhoI

3036 CCCATACCTCGAGCTTCACGCTGCCGCAAGCACTCAGGG

3075 CGCAAGGGCTGCTAAAAGGAAGCGGAACACGTAGAAAGC 

3114 CAGTCCGCAGAAACGGTGCTGACCCCGGATGAATGTCAG 

3153 CTACTGGGCTATCTGGACAAGGGAAAACGCAAGCGCAAA 

3192 GAGAAAGCAGGTAGCTTGCAGTGGGCTTACATGGCGATA

3231 GCTAGACTGGGCGGTTTTATGGACAGCAAGCGAACCGGA
PvuII

3270 ATTGCCAGCTGGGGCGCCCTCTGGTAAGGTTGGGAAGCC 

3309 CTGCAAAGTAAACTGGATGGCTTTCTTGCCGCCAAGGAT 

BglII

3348 CTGATGGCGCAGGGGATCAAGATCTGATCAAGAGACAGG 

3387 ATGAGGATCGTTTCGC ATG GAT ATT AAT ACT 

1►Met Asp Ile Asn Thr

3418 GAA ACT GAG ATC AAG CAA AAG CAT TCA CTA
6►Glu Thr Glu Ile Lys Gln Lys His Ser Leu

3448 ACC CCC TTT CCT GTT TTC CTA ATC AGC CCG
16►Thr Pro Phe Pro Val Phe Leu Ile Ser Pro

3478 GCA TTT CGC GGG CGA TAT TTT CAC AGC TAT
26►Ala Phe Arg Gly Arg Tyr Phe His Ser Tyr

3508 TTC AGG AGT TCA GCC ATG AAC GCT TAT TAC
36►Phe Arg Ser Ser Ala Met Asn Ala Tyr Tyr
3538 ATT CAG GAT CGT CTT GAG GCT CAG AGC TGG
46►Ile Gln Asp Arg Leu Glu Ala Gln Ser Trp

3568 GCG CGT CAC TAC CAG CAG [illegible] [illegible] CGT GAA
56►Ala Arg His Tyr Gln Gln Leu Ala Arg Glu

3598 GAG AAA GAG GCA GAA [illegible] GCA GAC GAC ATG
66►Glu Lys Glu Ala Glu Leu Ala Asp Asp Met

3628 GAA AAA GGC CTG CCC CAG CAC CTG TTT GAA
76►Glu Lys Gly Leu Pro Gln His Leu Phe Glu

3658 TCG CTA TGC ATC GAT CAT TTG CAA CGC CAC
86►Ser Leu Cys Ile Asp His Leu Gln Arg His

3688 GGG GCC AGC AAA AAA TCC ATT ACC CGT GCG
96►Gly Ala Ser Lys Lys Ser Ile Thr Arg Ala

3718 TTT GAT GAC GAT GTT GAG [illegible] CAG GAG CGC
106►Phe Asp Asp Asp Val Glu Phe Gln Glu Arg

3748 ATG GCA GAA CAC ATC CGG TAC ATG GTT GAA
116►Met Ala Glu His Ile Arg Tyr Met Val Glu

3778 ACC ATT GCT CAC CAC CAG GTT GAT ATT GAT
126►Thr Ile Ala His His Gln Val Asp Ile Asp

HindIII

3808 TCA GAG GTA TAA AACGAGTAGA AGC TTG GCT
136►Ser Glu Val • • •

3839 GTT TTG GCG GAT GAG AGA AGA TTT TCA GCC
3869 TGA TACAGATTAAATCAGAACGCAGAAGCGGTCTGATA 
3907 AAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCA 



3946 CCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGC 
3985 GCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGG 
4024 AACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAA

4063 AGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA

4102 CGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTT

4141 GAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGG

4180 ACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAA

4219 GGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAAC

4258 TCTTTTGTTTATTTTTCTAAATACATTCAAATATGTATC

4297 CGCTCATGAGACAATAACCCTGATAAATGCTTCAATAAT

4336 ATTGAAAAAGGAAGAGT ATG AGT ATT CAA CAT

1►Met Ser Ile Gln His



4368 TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG
6►Phe Arg Val Ala Leu Ile Pro Phe Phe Ala

4398 GCA TTT TGC CTT CCT GTT TTT GCT CAC CCA
16►Ala Phe Cys Leu Pro Val Phe Ala His Pro

4428 GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA
26►Glu Thr Leu Val Lys Val Lys Asp Ala Glu

4458 GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC
36►Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile

4488 GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT
46►Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu

4518 GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA
56►Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro

4548 ATG ATG AGC ACT TTT AAA GTT CTG CTA TGT
66►Met Met Ser Thr Phe Lys Val Leu Leu Cys
4578 GGC GCG GTA TTA TCC CGT GTT GAC GCC GGG
76►Gly Ala Val Leu Ser Arg Val Asp Ala Gly

4608 CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT
86►Gln Glu Gln Leu Gly Arg Arg Ile His Tyr

ScaI

4638 TCT CAG AAT GAC TTG GTT GAG TAC TCA CCA
96►Ser Gln Asn Asp Leu Val Glu Tyr Ser Pro

4668 GTC ACA GAA AAG CAT CTT ACG GAT GGC ATG
106►Val Thr Glu Lys His Leu Thr Asp Gly Met

4698 ACA GTA AGA GAA TTA TGC AGT GCT GCC ATA
116►Thr Val Arg Glu Leu Cys Ser Ala Ala Ile

4728 ACC ATG AGT GAT AAC ACT GCG GCC AAC TTA
126►Thr Met Ser Asp Asn Thr Ala Ala Asn Leu

4758 CTT CTG ACA ACG ATC GGA GGA CCG AAG GAG
136►Leu Leu Thr Thr Ile Gly Gly Pro Lys Glu

4788 CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT
146►Leu Thr Ala Phe Leu His Asn Met Gly Asp

4818 CAT GTA ACT CGC CTT GAT CGT TGG GAA CCG
156►His Val Thr Arg Leu Asp Arg Trp Glu Pro

4848 GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG
166►Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu

4878 CGT GAC ACC ACG ATG CCT GTA GCA ATG GCA
176►Arg Asp Thr Thr Met Pro Val Ala Met Ala

4908 ACA ACG TTG CGC AAA CTA TTA ACT GGC GAA
186►Thr Thr Leu Arg Lys Leu Leu Thr Gly Glu

4938 CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA
196►Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu
4968 ATA GAC TGG ATG GAG GCG GAT AAA GTT GCA
206►Ile Asp Trp Met Glu Ala Asp Lys Val Ala

4998 GGA CCA CTT CTG CGC TCG GCC CTT CCG GCT
216►Gly Pro Leu Leu Arg Ser Ala Leu Pro Ala

5028 GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC
226►Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala

5058 GGT GAG CGT GGG TCT CGC GGT ATC ATT GCA
236►Gly Glu Arg Gly Ser Arg Gly Ile Ile Ala

5088 GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT
246►Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg

5118 ATC GTA GTT ATC TAC ACG ACG GGG AGT CAG
256►Ile Val Val Ile Tyr Thr Thr Gly Ser Gln

5148 GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC
266►Ala Thr Met Asp Glu Arg Asn Arg Gln Ile

5178 GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT
276►Ala Glu Ile Gly Ala Ser Leu Ile Lys His

5208 TGG TAA CTGTCAGACCAAGTTTACTCATATATACTTT
286►Trp • • •

5245 AGATTGATTTACGCGCCCTGTAGCGGCGCATTAAGCGCG 

5284 GCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTT 

5323 GCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCT

5362 TCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCT

5401 CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCT 

5440 TTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGAT 

5479 GGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTT 
5518 CGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGA
5557 CTCTTGTTCCAAACTTGAACAACACTCAACCCTATCTCG
5596 GGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCG
5635 GCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTT
5674 AACGCGAATTTTAACAAAATATTAACGTTTACAATTTAA
5713 AAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGAC
5752 CAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTC
5791 AGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCC
5830 TTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAA
5869 ACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGA
5908 GCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAG
5947 AGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA 
5986 GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTAC 
6025 ATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGC 
6064 CAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAG 
6103 ACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAAC 
6142 GGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGAC 
6181 CTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGA 
6220 AAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTA 
6259 TCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAG 
6298 GGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCC
6337 TGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTT 



6376 GTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGC

6415 CAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTG

6454 GCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGA

6493 TTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGA

6532 TACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTC 

6571 AGTGAGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTT 

6610 TCTCCTTACGCATCTGTGCGGTATTTCACACCGCATAGG 

6649 GTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGAC 

6688 GCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTAC 

6727 AGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAG 

6766 AGGTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCAA 

6805 GGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGCCTGC 

6844 CACCATACCCACGCCGAAACAAGCGCTCATGAGCCCGAA 

6883 GTGGCGAGCCCGATCTTCCCCATCGGTGATGTCGGCGAT 

6922 ATAGGCGCCAGCAACCGCACCTGTGGCGCCGGTGATGCC 

6961 GGCCACGATGCGTCCGGCGTAGAGGATCTGCTCATGTTT

7000 GACAGCTTATC 
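
The enzyme names printed above rows of Figure 14b mark restriction sites in the sequence. As a rough illustration only, the following Python sketch locates such sites by plain string search and reports 1-based positions in the figure's numbering. The function name `find_sites`, the `SITES` table and the choice of test row (the row starting at position 2848) are assumptions made for this example and are not part of the original disclosure; the recognition sequences are the standard ones for these enzymes.

```python
# Recognition sequences of the enzymes labelled in Figure 14b
# (standard published sites; all are palindromic, so searching
# one strand is sufficient). This table is an illustrative assumption.
SITES = {
    "PstI": "CTGCAG",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "PvuII": "CAGCTG",
    "BglII": "AGATCT",
    "ScaI": "AGTACT",
}


def find_sites(sequence, first_position=1, sites=SITES):
    """Return {enzyme: [1-based start positions]} for every site found.

    `first_position` is the figure coordinate of the first base of
    `sequence`, so the results line up with the printed row labels.
    """
    seq = "".join(sequence.upper().split())  # drop spaces between codons
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(first_position + start)
            start = seq.find(site, start + 1)  # also catches overlapping hits
        hits[enzyme] = positions
    return hits


# Hypothetical usage: the figure row that begins at position 2848.
row_2848 = "GCA GAG CAG AAG GTG GCA GCA TAGATCTCGAG"
print(find_sites(row_2848, first_position=2848))
```

For that row the search reports a BglII site starting at 2870 and an XhoI site starting at 2874, in line with the two labels printed above it.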



SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT:
(A) NAME: European Molecular Biology Laboratory (EMBL)
(B) STREET: Meyerhofstrasse 1
(C) CITY: Heidelberg
(E) COUNTRY: DE
(F) POSTAL CODE (ZIP): D-69117

(ii) TITLE OF INVENTION: Novel DNA Cloning Method
(iii) NUMBER OF SEQUENCES: 14

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: EP 97121562.2
(B) FILING DATE: 05-DEC-1997

(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: EP 98118756.0
(B) FILING DATE: 05-OCT-1998

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6150 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: pBAD24-recET

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: complement(96..974)
(D) OTHER INFORMATION: /product = "araC"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 1320..2162
(D) OTHER INFORMATION: /product = "t-recE"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 2155..2972
(D) OTHER INFORMATION: /product = "recT"

(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 3493..4353
(D) OTHER INFORMATION: /product = "bla"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 
ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60
TCCGTCAAGC CGTCAATTGT CTGATTCGTT ACCAATTATG ACAACTTGAC GGCTACATCA 120
TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180
AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240
GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGCCAG 300
CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360
CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420
TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480
TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540
CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600
GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660
TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCTCCGGA 720
TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780
ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTGACCGC GAATGGTGAG ATTGAGAATA 840
TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900
GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960
TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020
TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080
ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140
AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200
CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260
TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACCA 1320
TGGATCCCGT AATCGTAGAA GACATAGAGC CAGGTATTTA TTACGGAATT TCGAATGAGA 1380
ATTACCACGC GGGTCCCGGT ATCAGTAAGT CTCAGCTCGA TGACATTGCT GATACTCCGG 1440
CACTATATTT GTGGCGTAAA AATGCCCCCG TGGACACCAC AAAGACAAAA ACGCTCGATT 1500
TAGGAACTGC TTTCCACTGC CGGGTACTTG AACCGGAAGA ATTCAGTAAC CGCTTTATCG 1560
TAGCACCTGA ATTTAACCGC CGTACAAACG CCGGAAAAGA AGAAGAGAAA GCGTTTCTGA 1620

TGGAATGCGC AAGCACAGGA AAAACGGTTA TCACTGCGGA AGAAGGCCGG AAAATTGAAC 1680 

TCATGTATCA AAGCGTTATG GCTTTGCCGC TGGGGCAATG GCTTGTTGAA AGCGCCGGAC 1740 

ACGCTGAATC ATCAATTTAC TGGGAAGATC CTGAAACAGG AATTTTGTGT CGGTGCCGTC 1800 

CGGACAAAAT TATCCCTGAA TTTCACTGGA TCATGGACGT GAAAACTACG GCGGATATTC 1860 

AACGATTCAA AACCGCTTAT TACGACTACC GCTATCACGT TCAGGATGCA TTCTACAGTG 1920 

ACGGTTATGA AGCACAGTTT GGAGTGCAGC CAACTTTCGT TTTTCTGGTT GCCAGCACAA 1980 

CTATTGAATG CGGACGTTAT CCGGTTGAAA TTTTCATGAT GGGCGAAGAA GCAAAACTGG 2040 

CAGGTCAACA GGAATATCAC CGCAATCTGC GAACCCTGTC TGACTGCCTG AATACCGATG 2100 

AATGGCCAGC TATTAAGACA TTATCACTGC CCCGCTGGGC TAAGGAATAT GCAAATGACT 2160 

AAGCAACCAC CAATCGCAAA AGCCGATCTG CAAAAAACTC AGGGAAACCG TGCACCAGCA 2220 

GCAGTTAAAA ATAGCGACGT GATTAGTTTT ATTAACCAGC CATCAATGAA AGAGCAACTG 2280 

GCAGCAGCTC TTCCACGCCA TATGACGGCT GAACGTATGA TCCGTATCGC CACCACAGAA 2340 

ATTCGTAAAG TTCCGGCGTT AGGAAACTGT GACACTATGA GTTTTGTCAG TGCGATCGTA 2400 

CAGTGTTCAC AGCTCGGACT TGAGCCAGGT AGCGCCCTCG GTCATGCATA TTTACTGCCT 2460

TTTGGTAATA AAAACGAAAA GAGCGGTAAA AAGAACGTTC AGCTAATCAT TGGCTATCGC 2520 

GGCATGATTG ATCTGGCTCG CCGTTCTGGT CAAATCGCCA GCCTGTCAGC CCGTGTTGTC 2580 

CGTGAAGGTG ACGAGTTTAG CTTCGAATTT GGCCTTGATG AAAAGTTAAT ACACCGCCCG 2640 

GGAGAAAACG AAGATGCCCC GGTTACCCAC GTCTATGCTG TCGCAAGACT GAAAGACGGA 2700 

GGTACTCAGT TTGAAGTTAT GACGCGCAAA CAGATTGAGC TGGTGCGCAG CCTGAGTAAA 2760 

GCTGGTAATA ACGGGCCGTG GGTAACTCAC TGGGAAGAAA TGGCAAAGAA AACGGCTATT 2820 

CGTCGCCTGT TCAAATATTT GCCCGTATCA ATTGAGATCC AGCGTGCAGT ATCAATGGAT 2880 

GAAAAGGAAC CACTGACAAT CGATCCTGCA GATTCCTCTG TATTAACCGG GGAATACAGT 2940 

GTAATCGATA ATTCAGAGGA ATAGATCTAA GCTTGGCTGT TTTGGCGGAT GAGAGAAGAT 3000 

TTTCAGCCTG ATACAGATTA AATCAGAACG CAGAAGCGGT CTGATAAAAC AGAATTTGCC 3060 

TGGCGGCAGT AGCGCGGTGG TCCCACCTGA CCCCATGCCG AACTCAGAAG TGAAACGCCG 3120 

TAGCGCCGAT GGTAGTGTGG GGTCTCCCCA TGCGAGAGTA GGGAACTGCC AGGCATCAAA 3180 

TAAAACGAAA GGCTCAGTCG AAAGACTGGG CCTTTCGTTT TATCTGTTGT TTGTCGGTGA 3240 

ACGCTCTCCT GAGTAGGACA AATCCGCCGG GAGCGGATTT GAACGTTGCG AAGCAACGGC 3300 

CCGGAGGGTG GCGGGCAGGA CGCCCGCCAT AAACTGCCAG GCATCAAATT AAGCAGAAGG 3360 

CCATCCTGAC GGATGGCCTT TTTGCGTTTC TACAAACTCT TTTGTTTATT TTTCTAAATA 3420 

CATTCAAATA TGTATCCGCT CATGAGACAA TAACCCTGAT AAATGCTTCA ATAATATTGA 3480 

AAAAGGAAGA GTATGAGTAT TCAACATTTC CGTGTCGCCC TTATTCCCTT TTTTGCGGCA 3540 

TTTTGCCTTC CTGTTTTTGC TCACCCAGAA ACGCTGGTGA AAGTAAAAGA TGCTGAAGAT 3600
CAGTTGGGTG CACGAGTGGG TTACATCGAA CTGGATCTCA ACAGCGGTAA GATCCTTGAG 3660

AGTTTTCGCC CCGAAGAACG TTTTCCAATG ATGAGCACTT TTAAAGTTCT GCTATGTGGC 3720

GCGGTATTAT CCCGTGTTGA CGCCGGGCAA GAGCAACTCG GTCGCCGCAT ACACTATTCT 3780

CAGAATGACT TGGTTGAGTA CTCACCAGTC ACAGAAAAGC ATCTTACGGA TGGCATGACA 3840

GTAAGAGAAT TATGCAGTGC TGCCATAACC ATGAGTGATA ACACTGCGGC CAACTTACTT 3900

CTGACAACGA TCGGAGGACC GAAGGAGCTA ACCGCTTTTT TGCACAACAT GGGGGATCAT 3960

GTAACTCGCC TTGATCGTTG GGAACCGGAG CTGAATGAAG CCATACCAAA CGACGAGCGT 4020

GACACCACGA TGCCTGTAGC AATGGCAACA ACGTTGCGCA AACTATTAAC TGGCGAACTA 4080

CTTACTCTAG CTTCCCGGCA ACAATTAATA GACTGGATGG AGGCGGATAA AGTTGCAGGA 4140

CCACTTCTGC GCTCGGCCCT TCCGGCTGGC TGGTTTATTG CTGATAAATC TGGAGCCGGT 4200

GAGCGTGGGT CTCGCGGTAT CATTGCAGCA CTGGGGCCAG ATGGTAAGCC CTCCCGTATC 4260

GTAGTTATCT ACACGACGGG GAGTCAGGCA ACTATGGATG AACGAAATAG ACAGATCGCT 4320

GAGATAGGTG CCTCACTGAT TAAGCATTGG TAACTGTCAG ACCAAGTTTA CTCATATATA 4380

CTTTAGATTG ATTTACGCGC CCTGTAGCGG CGCATTAAGC GCGGCGGGTG TGGTGGTTAC 4440

GCGCAGCGTG ACCGCTACAC TTGCCAGCGC CCTAGCGCCC GCTCCTTTCG CTTTCTTCCC 4500

TTCCTTTCTC GCCACGTTCG CCGGCTTTCC CCGTCAAGCT CTAAATCGGG GGCTCCCTTT 4560

AGGGTTCCGA TTTAGTGCTT TACGGCACCT CGACCCCAAA AAACTTGATT TGGGTGATGG 4620

TTCACGTAGT GGGCCATCGC CCTGATAGAC GGTTTTTCGC CCTTTGACGT TGGAGTCCAC 4680

GTTCTTTAAT AGTGGACTCT TGTTCCAAAC TTGAACAACA CTCAACCCTA TCTCGGGCTA 4740

TTCTTTTGAT TTATAAGGGA TTTTGCCGAT TTCGGCCTAT TGGTTAAAAA ATGAGCTGAT 4800

TTAACAAAAA TTTAACGCGA ATTTTAACAA AATATTAACG TTTACAATTT AAAAGGATCT 4860

AGGTGAAGAT CCTTTTTGAT AATCTCATGA CCAAAATCCC TTAACGTGAG TTTTCGTTCC 4920

ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC TTGAGATCCT TTTTTTCTGC 4980

GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT TGTTTGCCGG 5040

ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG CAGATACCAA 5100

ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT GTAGCACCGC 5160

CTACATACCT CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT 5220

GTCTTACCGG GTTGGACTCA AGACGATAGT TACCGGATAA GGCGCAGCGG TCGGGCTGAA 5280

CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA CTGAGATACC 5340

TACAGCGTGA GCTATGAGAA AGCGCCACGC TTCCCGAAGG GAGAAAGGCG GACAGGTATC 5400

CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG GGAAACGCCT 5460

GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT 5520

GCTCGTCAGG GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT TTACGGTTCC 5580

TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT GATTCTGTGG 5640

ATAACCGTAT TACCGCCTTT GAGTGAGCTG ATACCGCTCG CCGCAGCCGA ACGACCGAGC 5700

GCAGCGAGTC AGTGAGCGAG GAAGCGGAAG AGCGCCTGAT GCGGTATTTT CTCCTTACGC 5760

ATCTGTGCGG TATTTCACAC CGCATAGGGT CATGGCTGCG CCCCGACACC CGCCAACACC 5820

CGCTGACGCG CCCTGACGGG CTTGTCTGCT CCCGGCATCC GCTTACAGAC AAGCTGTGAC 5880

CGTCTCCGGG AGCTGCATGT GTCAGAGGTT TTCACCGTCA TCACCGAAAC GCGCGAGGCA 5940

GCAAGGAGAT GGCGCCCAAC AGTCCCCCGG CCACGGGGCC TGCCACCATA CCCACGCCGA 6000

AACAAGCGCT CATGAGCCCG AAGTGGCGAG CCCGATCTTC CCCATCGGTG ATGTCGGCGA 6060

TATAGGCGCC AGCAACCGCA CCTGTGGCGC CGGTGATGCC GGCCACGATG CGTCCGGCGT 6120

AGAGGATCTG CTCATGTTTG ACAGCTTATC 6150



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 843 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: t-recE

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..843

(D) OTHER INFORMATION: /product = "t-recE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATG GAT CCC GTA ATC GTA GAA GAC ATA GAG CCA GGT ATT TAT TAC GGA 48
Met Asp Pro Val Ile Val Glu Asp Ile Glu Pro Gly Ile Tyr Tyr Gly
1 5 10 15

ATT TCG AAT GAG AAT TAC CAC GCG GGT CCC GGT ATC AGT AAG TCT CAG 96
Ile Ser Asn Glu Asn Tyr His Ala Gly Pro Gly Ile Ser Lys Ser Gln
20 25 30

CTC GAT GAC ATT GCT GAT ACT CCG GCA CTA TAT TTG TGG CGT AAA AAT 144
Leu Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn
35 40 45

GCC CCC GTG GAC ACC ACA AAG ACA AAA ACG CTC GAT TTA GGA ACT GCT 192
Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr Ala
50 55 60

TTC CAC TGC CGG GTA CTT GAA CCG GAA GAA TTC AGT AAC CGC TTT ATC 240
Phe His Cys Arg Val Leu Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile
65 70 75 80

GTA GCA CCT GAA TTT AAC CGC CGT ACA AAC GCC GGA AAA GAA GAA GAG 288
Val Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys Glu Glu Glu
85 90 95

AAA GCG TTT CTG ATG GAA TGC GCA AGC ACA GGA AAA ACG GTT ATC ACT 336
Lys Ala Phe Leu Met Glu Cys Ala Ser Thr Gly Lys Thr Val Ile Thr
100 105 110

GCG GAA GAA GGC CGG AAA ATT GAA CTC ATG TAT CAA AGC GTT ATG GCT 384
Ala Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser Val Met Ala
115 120 125

TTG CCG CTG GGG CAA TGG CTT GTT GAA AGC GCC GGA CAC GCT GAA TCA 432
Leu Pro Leu Gly Gln Trp Leu Val Glu Ser Ala Gly His Ala Glu Ser
130 135 140

TCA ATT TAC TGG GAA GAT CCT GAA ACA GGA ATT TTG TGT CGG TGC CGT 480
Ser Ile Tyr Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg Cys Arg
145 150 155 160

CCG GAC AAA ATT ATC CCT GAA TTT CAC TGG ATC ATG GAC GTG AAA ACT 528
Pro Asp Lys Ile Ile Pro Glu Phe His Trp Ile Met Asp Val Lys Thr
165 170 175

ACG GCG GAT ATT CAA CGA TTC AAA ACC GCT TAT TAC GAC TAC CGC TAT 576
Thr Ala Asp Ile Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr
180 185 190

CAC GTT CAG GAT GCA TTC TAC AGT GAC GGT TAT GAA GCA CAG TTT GGA 624
His Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr Glu Ala Gln Phe Gly
195 200 205

GTG CAG CCA ACT TTC GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA TGC 672
Val Gln Pro Thr Phe Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys
210 215 220

GGA CGT TAT CCG GTT GAA ATT TTC ATG ATG GGC GAA GAA GCA AAA CTG 720
Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly Glu Glu Ala Lys Leu
225 230 235 240

GCA GGT CAA CAG GAA TAT CAC CGC AAT CTG CGA ACC CTG TCT GAC TGC 768
Ala Gly Gln Gln Glu Tyr His Arg Asn Leu Arg Thr Leu Ser Asp Cys
245 250 255

CTG AAT ACC GAT GAA TGG CCA GCT ATT AAG ACA TTA TCA CTG CCC CGC 816
Leu Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu Ser Leu Pro Arg
260 265 270

TGG GCT AAG GAA TAT GCA AAT GAC TAA 843
Trp Ala Lys Glu Tyr Ala Asn Asp *
275 280 
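
The CDS listing above pairs each codon row with a row of three-letter residues. As a hedged illustration, the Python sketch below translates a codon row with the standard genetic code and reports any position where the printed residue disagrees, which is a simple way to spot transcription errors in such a listing. The helper names `translate_row` and `mismatches` are hypothetical and introduced only for this example; the code assumes the standard (table 1) genetic code and upper-case DNA codons separated by spaces.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue names; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate_row(codon_row):
    """Translate a space-separated codon row into three-letter residues."""
    return [THREE_LETTER[CODON_TABLE[c]] for c in codon_row.upper().split()]


def mismatches(codon_row, residue_row):
    """Return (index, codon, printed, expected) for each disagreement."""
    codons = codon_row.upper().split()
    printed = residue_row.split()
    expected = translate_row(codon_row)
    return [
        (i + 1, codon, shown, wanted)
        for i, (codon, shown, wanted) in enumerate(zip(codons, printed, expected))
        if shown != wanted
    ]


# Hypothetical usage: the final codon row of SEQ ID NO: 2.
last_row = "TGG GCT AAG GAA TAT GCA AAT GAC TAA"
print(translate_row(last_row))
print(mismatches(last_row, "Trp Ala Lys Glu Tyr Ala Asn Asp *"))
```

An empty mismatch list means the codon row and the residue row agree; a codon such as TAG printed under Tyr would be reported, since TAG is a stop codon.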

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 281 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Met Asp Pro Val Ile Val Glu Asp Ile Glu Pro Gly Ile Tyr Tyr Gly
1 5 10 15

Ile Ser Asn Glu Asn Tyr His Ala Gly Pro Gly Ile Ser Lys Ser Gln
20 25 30

Leu Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn
35 40 45

Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr Ala
50 55 60

Phe His Cys Arg Val Leu Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile
65 70 75 80

Val Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys Glu Glu Glu
85 90 95

Lys Ala Phe Leu Met Glu Cys Ala Ser Thr Gly Lys Thr Val Ile Thr
100 105 110

Ala Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser Val Met Ala
115 120 125

Leu Pro Leu Gly Gln Trp Leu Val Glu Ser Ala Gly His Ala Glu Ser
130 135 140

Ser Ile Tyr Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg Cys Arg
145 150 155 160

Pro Asp Lys Ile Ile Pro Glu Phe His Trp Ile Met Asp Val Lys Thr
165 170 175

Thr Ala Asp Ile Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr
180 185 190

His Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr Glu Ala Gln Phe Gly
195 200 205

Val Gln Pro Thr Phe Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys
210 215 220

Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly Glu Glu Ala Lys Leu
225 230 235 240

Ala Gly Gln Gln Glu Tyr His Arg Asn Leu Arg Thr Leu Ser Asp Cys
245 250 255

Leu Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu Ser Leu Pro Arg
260 265 270

Trp Ala Lys Glu Tyr Ala Asn Asp *



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 810 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: recT

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..810

(D) OTHER INFORMATION: /product = "recT"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATG ACT AAG CAA CCA CCA ATC GCA AAA GCC GAT CTG CAA AAA ACT CAG 48
Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp Leu Gln Lys Thr Gln
275 280 285 290

GGA AAC CGT GCA CCA GCA GCA GTT AAA AAT AGC GAC GTG ATT AGT TTT 96
Gly Asn Arg Ala Pro Ala Ala Val Lys Asn Ser Asp Val Ile Ser Phe
300 305 310

ATT AAC CAG CCA TCA ATG AAA GAG CAA CTG GCA GCA GCT CTT CCA CGC 144
Ile Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala Ala Leu Pro Arg
315 320 325

CAT ATG ACG GCT GAA CGT ATG ATC CGT ATC GCC ACC ACA GAA ATT CGT 192
His Met Thr Ala Glu Arg Met Ile Arg Ile Ala Thr Thr Glu Ile Arg
330 335 340 345

AAA GTT CCG GCG TTA GGA AAC TGT GAC ACT ATG AGT TTT GTC AGT GCG 240
Lys Val Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe Val Ser Ala
350 355 360

ATC GTA CAG TGT TCA CAG CTC GGA CTT GAG CCA GGT AGC GCC CTC GGT 288
Ile Val Gln Cys Ser Gln Leu Gly Leu Glu Pro Gly Ser Ala Leu Gly
365 370 375

CAT GCA TAT TTA CTG CCT TTT GGT AAT AAA AAC GAA AAG AGC GGT AAA 336
His Ala Tyr Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser Gly Lys
380 385 390

AAG AAC GTT CAG CTA ATC ATT GGC TAT CGC GGC ATG ATT GAT CTG GCT 384
Lys Asn Val Gln Leu Ile Ile Gly Tyr Arg Gly Met Ile Asp Leu Ala
395 400 405

CGC CGT TCT GGT CAA ATC GCC AGC CTG TCA GCC CGT GTT GTC CGT GAA 432
Arg Arg Ser Gly Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg Glu
410 415 420 425

GGT GAC GAG TTT AGC TTC GAA TTT GGC CTT GAT GAA AAG TTA ATA CAC 480
Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu Asp Glu Lys Leu Ile His
430 435 440

CGC CCG GGA GAA AAC GAA GAT GCC CCG GTT ACC CAC GTC TAT GCT GTC 528
Arg Pro Gly Glu Asn Glu Asp Ala Pro Val Thr His Val Tyr Ala Val
445 450 455

GCA AGA CTG AAA GAC GGA GGT ACT CAG TTT GAA GTT ATG ACG CGC AAA 576
Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu Val Met Thr Arg Lys
460 465 470

CAG ATT GAG CTG GTG CGC AGC CTG AGT AAA GCT GGT AAT AAC GGG CCG 624
Gln Ile Glu Leu Val Arg Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro
475 480 485

TGG GTA ACT CAC TGG GAA GAA ATG GCA AAG AAA ACG GCT ATT CGT CGC 672
Trp Val Thr His Trp Glu Glu Met Ala Lys Lys Thr Ala Ile Arg Arg
490 495 500 505

CTG TTC AAA TAT TTG CCC GTA TCA ATT GAG ATC CAG CGT GCA GTA TCA 720
Leu Phe Lys Tyr Leu Pro Val Ser Ile Glu Ile Gln Arg Ala Val Ser
510 515 520

ATG GAT GAA AAG GAA CCA CTG ACA ATC GAT CCT GCA GAT TCC TCT GTA 768
Met Asp Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp Ser Ser Val
525 530 535

TTA ACC GGG GAA TAC AGT GTA ATC GAT AAT TCA GAG GAA TAG 810
Leu Thr Gly Glu Tyr Ser Val Ile Asp Asn Ser Glu Glu *
540 545 550
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 270 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp Leu Gln Lys Thr Gln
1 5 10 15

Gly Asn Arg Ala Pro Ala Ala Val Lys Asn Ser Asp Val Ile Ser Phe
20 25 30

Ile Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala Ala Leu Pro Arg
35 40 45

His Met Thr Ala Glu Arg Met Ile Arg Ile Ala Thr Thr Glu Ile Arg
50 55 60

Lys Val Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe Val Ser Ala
65 70 75 80

Ile Val Gln Cys Ser Gln Leu Gly Leu Glu Pro Gly Ser Ala Leu Gly
85 90 95

His Ala Tyr Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser Gly Lys
100 105 110

Lys Asn Val Gln Leu Ile Ile Gly Tyr Arg Gly Met Ile Asp Leu Ala
115 120 125

Arg Arg Ser Gly Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg Glu
130 135 140

Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu Asp Glu Lys Leu Ile His
145 150 155 160

Arg Pro Gly Glu Asn Glu Asp Ala Pro Val Thr His Val Tyr Ala Val
165 170 175

Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu Val Met Thr Arg Lys
180 185 190

Gln Ile Glu Leu Val Arg Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro
195 200 205

Trp Val Thr His Trp Glu Glu Met Ala Lys Lys Thr Ala Ile Arg Arg
210 215 220

Leu Phe Lys Tyr Leu Pro Val Ser Ile Glu Ile Gln Arg Ala Val Ser
225 230 235 240

Met Asp Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp Ser Ser Val
245 250 255

Leu Thr Gly Glu Tyr Ser Val Ile Asp Asn Ser Glu Glu *
260 265 270

(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: araC

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: complement(1..876)

(D) OTHER INFORMATION: /product = "araC"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:



TGACAACTTG ACGGCTACAT CATTCACTTT TTCTTCACAA CCGGCACGGA ACTCGCTCGG 60

GCTGGCCCCG GTGCATTTTT TAAATACCCG CGAGAAATAG AGTTGATCGT CAAAACCAAC 120

ATTGCGACCG ACGGTGGCGA TAGGCATCCG GGTGGTGCTC AAAAGCAGCT TCGCCTGGCT 180

GATACGTTGG TCCTCGCGCC AGCTTAAGAC GCTAATCCCT AACTGCTGGC GGAAAAGATG 240

TGACAGACGC GACGGCGACA AGCAAACATG CTGTGCGACG CTGGCGATAT CAAAATTGCT 300

GTCTGCCAGG TGATCGCTGA TGTACTGACA AGCCTCGCGT ACCCGATTAT CCATCGGTGG 360

ATGGAGCGAC TCGTTAATCG CTTCCATGCG CCGCAGTAAC AATTGCTCAA GCAGATTTAT 420

CGCCAGCAGC TCCGAATAGC GCCCTTCCCC TTGCCCGGCG TTAATGATTT GCCCAAACAG 480

GTCGCTGAAA TGCGGCTGGT GCGCTTCATC CGGGCGAAAG AACCCCGTAT TGGCAAATAT 540

TGACGGCCAG TTAAGCCATT CATGCCAGTA GGCGCGCGGA CGAAAGTAAA CCCACTGGTG 600

ATACCATTCG CGAGCCTCCG GATGACGACC GTAGTGATGA ATCTCTCCTG GCGGGAACAG 660

CAAAATATCA CCCGGTCGGC AAACAAATTC TCGTCCCTGA TTTTTCACCA CCCCCTGACC 720

GCGAATGGTG AGATTGAGAA TATAACCTTT CATTCCCAGC GGTCGGTCGA TAAAAAAATC 780

GAGATAACCG TTGGCCTCAA TCGGCGTTAA ACCCGCCACC AGATGGGCAT TAAACGAGTA 840

TCCCGGCAGC AGGGGATCAT TTTGCGCTTC AGCCAT 876



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 292 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Ala Glu Ala Gln Asn Asp Pro Leu Leu Pro Gly Tyr Ser Phe Asn
1 5 10 15

Ala His Leu Val Ala Gly Leu Thr Pro Ile Glu Ala Asn Gly Tyr Leu
20 25 30

Asp Phe Phe Ile Asp Arg Pro Leu Gly Met Lys Gly Tyr Ile Leu Asn
35 40 45

Leu Thr Ile Arg Gly Gln Gly Val Val Lys Asn Gln Gly Arg Glu Phe
50 55 60

Val Cys Arg Pro Gly Asp Ile Leu Leu Phe Pro Pro Gly Glu Ile His
65 70 75 80

His Tyr Gly Arg His Pro Glu Ala Arg Glu Trp Tyr His Gln Trp Val
85 90 95

Tyr Phe Arg Pro Arg Ala Tyr Trp His Glu Trp Leu Asn Trp Pro Ser
100 105 110

Ile Phe Ala Asn Thr Gly Phe Phe Arg Pro Asp Glu Ala His Gln Pro
115 120 125

His Phe Ser Asp Leu Phe Gly Gln Ile Ile Asn Ala Gly Gln Gly Glu
130 135 140

Gly Arg Tyr Ser Glu Leu Leu Ala Ile Asn Leu Leu Glu Gln Leu Leu
145 150 155 160

Leu Arg Arg Met Glu Ala Ile Asn Glu Ser Leu His Pro Pro Met Asp
165 170 175

Asn Arg Val Arg Glu Ala Cys Gln Tyr Ile Ser Asp His Leu Ala Asp
180 185 190

Ser Asn Phe Asp Ile Ala Ser Val Ala Gln His Val Cys Leu Ser Pro
195 200 205

Ser Arg Leu Ser His Leu Phe Arg Gln Gln Leu Gly Ile Ser Val Leu
210 215 220

Ser Trp Arg Glu Asp Gln Arg Ile Ser Gln Ala Lys Leu Leu Leu Ser
225 230 235 240

Thr Thr Arg Met Pro Ile Ala Thr Val Gly Arg Asn Val Gly Phe Asp
245 250 255

Asp Gln Leu Tyr Phe Ser Arg Val Phe Lys Lys Cys Thr Gly Ala Ser
260 265 270

Pro Ser Glu Phe Arg Ala Gly Cys Glu Glu Lys Val Asn Asp Val Ala
275 280 285

Val Lys Leu Ser
290 
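
SEQ ID NO: 6 is annotated with the location complement(1..876), meaning the araC protein of SEQ ID NO: 7 is read from the reverse complement of the printed strand. The sketch below, offered only as an illustration under that reading, reverse-complements the last printed line of SEQ ID NO: 6 (positions 841 to 876) and translates it with the standard genetic code. The function names `reverse_complement` and `translate` are assumptions for this example and are not part of the original listing.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Base-pairing map used to build the complementary strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement, ignoring spaces between blocks."""
    seq = "".join(sequence.upper().split())
    return seq.translate(COMPLEMENT)[::-1]


def translate(sequence):
    """Translate frame 1 to one-letter residues, stopping at a stop codon."""
    seq = "".join(sequence.upper().split())
    residues = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Hypothetical usage: the final printed line of SEQ ID NO: 6.
tail = "TCCCGGCAGC AGGGGATCAT TTTGCGCTTC AGCCAT"
coding = reverse_complement(tail)
print(coding)             # coding-strand 5' end of the CDS
print(translate(coding))  # first residues of the encoded protein
```

The translated residues (MAEAQNDPLLPG in one-letter code) correspond to the opening residues of SEQ ID NO: 7, Met Ala Glu Ala Gln Asn Asp Pro Leu Leu Pro Gly.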



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 861 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: bla

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..861

(D) OTHER INFORMATION: /product = "bla"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

ATG AGT ATT CAA CAT TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA 48
Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
295 300 305

TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA 96
Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
310 315 320

GAT GCT GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT 144
Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
325 330 335 340

CTC AAC AGC GGT AAG ATC CTT GAG AGT TTT CGC CCC GAA GAA CGT TTT 192
Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
345 350 355

CCA ATG ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC GCG GTA TTA TCC 240
Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
360 365 370

CGT GTT GAC GCC GGG CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT 288
Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
375 380 385

CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA GAA AAG CAT CTT ACG 336
Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
390 395 400

GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC ATA ACC ATG AGT 384
Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
405 410 415 420

GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA CCG AAG 432
Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
425 430 435

GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT CGC CTT 480
Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
440 445 450

GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT 528
Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
455 460 465

GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA 576
Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
470 475 480

ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG 624
Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
485 490 495 500

ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG 672
Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
505 510 515

GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT 720
Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
520 525 530

CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC 768
Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
535 540 545

GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT 816
Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
550 555 560

AGA CAG ATC GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG TAA 861
Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp *
565 570 575
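
The pairing of codon rows and amino acid rows in SEQ ID NO: 8 follows the standard genetic code. The Python sketch below is an illustration only and is not part of the original listing: the function name `translate` and the one-letter amino acid output are assumptions introduced here. It translates the first nine codons and the last three codons of the bla reading frame as printed above.

```python
# Illustrative sketch (not from the original listing): translate a DNA
# reading frame with the standard genetic code, one-letter amino acids.
from itertools import product

BASES = "TCAG"
# Standard code in TCAG order for first, second and third codon position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First nine codons of SEQ ID NO: 8 (Met Ser Ile Gln His Phe Arg Val Ala).
start = translate("ATG AGT ATT CAA CAT TTC CGT GTC GCC")
# Last three codons of SEQ ID NO: 8 (His Trp stop).
end = translate("CAT TGG TAA")
print(start, end)
```

The same function can be applied to any codon row of the listing to confirm that a nucleotide row and the amino acid row beneath it agree.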



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 287 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp *
275 280 285

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7195 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : both 

(D) TOPOLOGY: both 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: pBAD-ETgamma 

(ix) FEATURE: 

(A) NAME/KEY: misc_feature 

(B) LOCATION: 3588..4004

(D) OTHER INFORMATION: /product = "red gamma" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60

TCCGTCAAGC CGTCAATTGT CTGATTCGTT ACCAATTATG ACAACTTGAC GGCTACATCA 120

TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180

AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240

GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGCCAG 300

CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360

CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420

TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480

TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540

CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600

GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660

TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCTCCGGA 720

TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780

ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTGACCGC GAATGGTGAG ATTGAGAATA 840

TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900

GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960

TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020

TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080


ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140

AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200

CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260

TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACCA 1320

TGGATCCCGT AATCGTAGAA GACATAGAGC CAGGTATTTA TTACGGAATT TCGAATGAGA 1380

ATTACCACGC GGGTCCCGGT ATCAGTAAGT CTCAGCTCGA TGACATTGCT GATACTCCGG 1440

CACTATATTT GTGGCGTAAA AATGCCCCCG TGGACACCAC AAAGACAAAA ACGCTCGATT 1500

TAGGAACTGC TTTCCACTGC CGGGTACTTG AACCGGAAGA ATTCAGTAAC CGCTTTATCG 1560

TAGCACCTGA ATTTAACCGC CGTACAAACG CCGGAAAAGA AGAAGAGAAA GCGTTTCTGA 1620

TGGAATGCGC AAGCACAGGA AAAACGGTTA TCACTGCGGA AGAAGGCCGG AAAATTGAAC 1680

TCATGTATCA AAGCGTTATG GCTTTGCCGC TGGGGCAATG GCTTGTTGAA AGCGCCGGAC 1740

ACGCTGAATC ATCAATTTAC TGGGAAGATC CTGAAACAGG AATTTTGTGT CGGTGCCGTC 1800

CGGACAAAAT TATCCCTGAA TTTCACTGGA TCATGGACGT GAAAACTACG GCGGATATTC 1860

AACGATTCAA AACCGCTTAT TACGACTACC GCTATCACGT TCAGGATGCA TTCTACAGTG 1920

ACGGTTATGA AGCACAGTTT GGAGTGCAGC CAACTTTCGT TTTTCTGGTT GCCAGCACAA 1980

CTATTGAATG CGGACGTTAT CCGGTTGAAA TTTTCATGAT GGGCGAAGAA GCAAAACTGG 2040

CAGGTCAACA GGAATATCAC CGCAATCTGC GAACCCTGTC TGACTGCCTG AATACCGATG 2100

AATGGCCAGC TATTAAGACA TTATCACTGC CCCGCTGGGC TAAGGAATAT GCAAATGACT 2160

AGATCTCGAG GTACCCGAGC ACGTGTTGAC AATTAATCAT CGGCATAGTA TATCGGCATA 2220

GTATAATACG ACAAGGTGAG GAACTAAACC ATGGCTAAGC AACCACCAAT CGCAAAAGCC 2280

GATCTGCAAA AAACTCAGGG AAACCGTGCA CCAGCAGCAG TTAAAAATAG CGACGTGATT 2340

AGTTTTATTA ACCAGCCATC AATGAAAGAG CAACTGGCAG CAGCTCTTCC ACGCCATATG 2400

ACGGCTGAAC GTATGATCCG TATCGCCACC ACAGAAATTC GTAAAGTTCC GGCGTTAGGA 2460

AACTGTGACA CTATGAGTTT TGTCAGTGCG ATCGTACAGT GTTCACAGCT CGGACTTGAG 2520

CCAGGTAGCG CCCTCGGTCA TGCATATTTA CTGCCTTTTG GTAATAAAAA CGAAAAGAGC 2580

GGTAAAAAGA ACGTTCAGCT AATCATTGGC TATCGCGGCA TGATTGATCT GGCTCGCCGT 2640

TCTGGTCAAA TCGCCAGCCT GTCAGCCCGT GTTGTCCGTG AAGGTGACGA GTTTAGCTTC 2700

GAATTTGGCC TTGATGAAAA 2760

ACCCACGTCT ATGCTGTCGC AAGACTGAAA GACGGAGGTA CTCAGTTTGA AGTTATGACG 2820

CGCAAACAGA TTGAGCTGGT GCGCAGCCTG AGTAAAGCTG GTAATAACGG GCCGTGGGTA 2880

ACTCACTGGG AAGAAATGGC AAAGAAAACG GCTATTCGTC GCCTGTTCAA ATATTTGCCC 2940

GTATCAATTG AGATCCAGCG TGCAGTATCA ATGGATGAAA AGGAACCACT GACAATCGAT 3000

CCTGCAGATT CCTCTGTATT AACCGGGGAA TACAGTGTAA TCGATAATTC AGAGGAATAG 3060

ATCTAAGCTT CCTGCTGAAC ATCAAAGGCA AGAAAACATC TGTTGTCAAA GACAGCATCC 3120


TTGAACAAGG ACAATTAACA GTTAACAAAT AAAAACGCAA AAGAAAATGC CGATATCCTA 3180

TTGGCATTTT CTTTTATTTC TTATCAACAT AAAGGTGAAT CCCATACCTC GAGCTTCACG 3240

CTGCCGCAAG CACTCAGGGC GCAAGGGCTG CTAAAAGGAA GCGGAACACG TAGAAAGCCA 3300

GTCCGCAGAA ACGGTGCTGA CCCCGGATGA ATGTCAGCTA CTGGGCTATC TGGACAAGGG 3360

AAAACGCAAG CGCAAAGAGA AAGCAGGTAG CTTGCAGTGG GCTTACATGG CGATAGCTAG 3420

ACTGGGCGGT TTTATGGACA GCAAGCGAAC CGGAATTGCC AGCTGGGGCG CCCTCTGGTA 3480

AGGTTGGGAA GCCCTGCAAA GTAAACTGGA TGGCTTTCTT GCCGCCAAGG ATCTGATGGC 3540

GCAGGGGATC AAGATCTGAT CAAGAGACAG GATGAGGATC GTTTCGCATG GATATTAATA 3600

CTGAAACTGA GATCAAGCAA AAGCATTCAC TAACCCCCTT TCCTGTTTTC CTAATCAGCC 3660

CGGCATTTCG CGGGCGATAT TTTCACAGCT ATTTCAGGAG TTCAGCCATG AACGCTTATT 3720

ACATTCAGGA TCGTCTTGAG GCTCAGAGCT GGGCGCGTCA CTACCAGCAG CTCGCCCGTG 3780

AAGAGAAAGA GGCAGAACTG GCAGACGACA TGGAAAAAGG CCTGCCCCAG CACCTGTTTG 3840

AATCGCTATG CATCGATCAT TTGCAACGCC ACGGGGCCAG CAAAAAATCC ATTACCCGTG 3900

CGTTTGATGA CGATGTTGAG TTTCAGGAGC GCATGGCAGA ACACATCCGG TACATGGTTG 3960

AAACCATTGC TCACCACCAG GTTGATATTG ATTCAGAGGT ATAAAACGAG TAGAAGCTTG 4020

GCTGTTTTGG CGGATGAGAG AAGATTTTCA GCCTGATACA GATTAAATCA GAACGCAGAA 4080

GCGGTCTGAT AAAACAGAAT TTGCCTGGCG GCAGTAGCGC GGTGGTCCCA CCTGACCCCA 4140

TGCCGAACTC AGAAGTGAAA CGCCGTAGCG CCGATGGTAG TGTGGGGTCT CCCCATGCGA 4200

GAGTAGGGAA CTGCCAGGCA TCAAATAAAA CGAAAGGCTC AGTCGAAAGA CTGGGCCTTT 4260

CGTTTTATCT GTTGTTTGTC GGTGAACGCT CTCCTGAGTA GGACAAATCC GCCGGGAGCG 4320

GATTTGAACG TTGCGAAGCA ACGGCCCGGA GGGTGGCGGG CAGGACGCCC GCCATAAACT 4380

GCCAGGCATC AAATTAAGCA GAAGGCCATC CTGACGGATG GCCTTTTTGC GTTTCTACAA 4440

ACTCTTTTGT TTATTTTTCT AAATACATTC AAATATGTAT CCGCTCATGA GACAATAACC 4500

CTGATAAATG CTTCAATAAT ATTGAAAAAG GAAGAGTATG AGTATTCAAC ATTTCCGTGT 4560

CGCCCTTATT CCCTTTTTTG CGGCATTTTG CCTTCCTGTT TTTGCTCACC CAGAAACGCT 4620

GGTGAAAGTA AAAGATGCTG AAGATCAGTT GGGTGCACGA GTGGGTTACA TCGAACTGGA 4680

TCTCAACAGC GGTAAGATCC TTGAGAGTTT TCGCCCCGAA GAACGTTTTC CAATGATGAG 4740

CACTTTTAAA GTTCTGCTAT GTGGCGCGGT ATTATCCCGT GTTGACGCCG GGCAAGAGCA 4800

ACTCGGTCGC CGCATACACT ATTCTCAGAA TGACTTGGTT GAGTACTCAC CAGTCACAGA 4860

AAAGCATCTT ACGGATGGCA TGACAGTAAG AGAATTATGC AGTGCTGCCA TAACCATGAG 4920

TGATAACACT GCGGCCAACT TACTTCTGAC AACGATCGGA GGACCGAAGG AGCTAACCGC 4980

TTTTTTGCAC AACATGGGGG ATCATGTAAC TCGCCTTGAT CGTTGGGAAC CGGAGCTGAA 5040

TGAAGCCATA CCAAACGACG AGCGTGACAC CACGATGCCT GTAGCAATGG CAACAACGTT 5100

GCGCAAACTA TTAACTGGCG AACTACTTAC TCTAGCTTCC CGGCAACAAT TAATAGACTG 5160

GATGGAGGCG GATAAAGTTG CAGGACCACT TCTGCGCTCG GCCCTTCCGG CTGGCTGGTT 5220 

TATTGCTGAT AAATCTGGAG CCGGTGAGCG TGGGTCTCGC GGTATCATTG CAGCACTGGG 5280 

GCCAGATGGT AAGCCCTCCC GTATCGTAGT TATCTACACG ACGGGGAGTC AGGCAACTAT 5340 

GGATGAACGA AATAGACAGA TCGCTGAGAT AGGTGCCTCA CTGATTAAGC ATTGGTAACT 5400 

GTCAGACCAA GTTTACTCAT ATATACTTTA GATTGATTTA CGCGCCCTGT AGCGGCGCAT 5460 

TAAGCGCGGC GGGTGTGGTG GTTACGCGCA GCGTGACCGC TACACTTGCC AGCGCCCTAG 5520 

CGCCCGCTCC TTTCGCTTTC TTCCCTTCCT TTCTCGCCAC GTTCGCCGGC TTTCCCCGTC 5580 

AAGCTCTAAA TCGGGGGCTC CCTTTAGGGT TCCGATTTAG TGCTTTACGG CACCTCGACC 5640 

CCAAAAAACT TGATTTGGGT GATGGTTCAC GTAGTGGGCC ATCGCCCTGA TAGACGGTTT 5700 

TTCGCCCTTT GACGTTGGAG TCCACGTTCT TTAATAGTGG ACTCTTGTTC CAAACTTGAA 5760 

CAACACTCAA CCCTATCTCG GGCTATTCTT TTGATTTATA AGGGATTTTG CCGATTTCGG 5820 

CCTATTGGTT AAAAAATGAG CTGATTTAAC AAAAATTTAA CGCGAATTTT AACAAAATAT 5880 

TAACGTTTAC AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT CATGACCAAA 5940 

ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA GATCAAAGGA 6000 

TCTTCTTGAG ATCCTTTTTT TCTGCGCGTA ATCTGCTGCT TGCAAACAAA AAAACCACCG 6060 

CTACCAGCGG TGGTTTGTTT GCCGGATCAA GAGCTACCAA CTCTTTTTCC GAAGGTAACT 6120 

GGCTTCAGCA GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA GTTAGGCCAC 6180

CACTTCAAGA ACTCTGTAGC ACCGCCTACA TACCTCGCTC TGCTAATCCT GTTACCAGTG 6240

GCTGCTGCCA GTGGCGATAA GTCGTGTCTT ACCGGGTTGG ACTCAAGACG ATAGTTACCG 6300 

GATAAGGCGC AGCGGTCGGG CTGAACGGGG GGTTCGTGCA CACAGCCCAG CTTGGAGCGA 6360 

ACGACCTACA CCGAACTGAG ATACCTACAG CGTGAGCTAT GAGAAAGCGC CACGCTTCCC 6420 

GAAGGGAGAA AGGCGGACAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG AGAGCGCACG 6480

AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT CTTTATAGTC CTGTCGGGTT TCGCCACCTC 6540 

TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC GGAGCCTATG GAAAAACGCC 6600 

AGCAACGCGG CCTTTTTACG GTTCCTGGCC TTTTGCTGGC CTTTTGCTCA CATGTTCTTT 6660

CCTGCGTTAT CCCCTGATTC TGTGGATAAC CGTATTACCG CCTTTGAGTG AGCTGATACC 6720 

GCTCGCCGCA GCCGAACGAC CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC GGAAGAGCGC 6780 

CTGATGCGGT ATTTTCTCCT TACGCATCTG TGCGGTATTT CACACCGCAT AGGGTCATGG 6840 

CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG 6900 

CATCCGCTTA CAGACAAGCT GTGACCGTCT CCGGGAGCTG CATGTGTCAG AGGTTTTCAC 6960 

CGTCATCACC GAAACGCGCG AGGCAGCAAG GAGATGGCGC CCAACAGTCC CCCGGCCACG 7020 

GGGCCTGCCA CCATACCCAC GCCGAAACAA GCGCTCATGA GCCCGAAGTG GCGAGCCCGA 7080

TCTTCCCCAT CGGTGATGTC GGCGATATAG GCGCCAGCAA CCGCACCTGT GGCGCCGGTG 7140

ATGCCGGCCA CGATGCGTCC GGCGTAGAGG ATCTGCTCAT GTTTGACAGC TTATC 7195

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7010 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : both 

(D) TOPOLOGY: both 

(vii) IMMEDIATE SOURCE: 

(B) CLONE: pBAD-alpha-beta-gamma

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1320..2000

(D) OTHER INFORMATION: /product = "red alpha"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2086..2871

(D) OTHER INFORMATION: /product = "red beta"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3403..3819

(D) OTHER INFORMATION: /product = "red gamma" 
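
The three CDS locations listed for SEQ ID NO: 11 can be checked against the lengths stated for the corresponding protein listings (SEQ ID NO: 12 to 14), since each coding region should span a whole number of codons. The following Python sketch is an illustration only and is not part of the original listing; the helper name `cds_codon_count` and the dictionary layout are assumptions. The stated lengths count the terminal stop position, which the listing prints as `*`.

```python
# Illustrative sketch (not from the original listing): check that each CDS
# location spans a whole number of codons and compare the codon count with
# the length stated in the corresponding protein listing.

def cds_codon_count(start: int, end: int) -> int:
    """Number of codons in an inclusive, 1-based location start..end."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"location {start}..{end} is not a multiple of 3")
    return length // 3


# product -> (CDS location in SEQ ID NO: 11, stated length of the protein
# listing, where the final position is the stop marked '*')
FEATURES = {
    "red alpha": ((1320, 2000), 227),  # SEQ ID NO: 12
    "red beta": ((2086, 2871), 262),   # SEQ ID NO: 13
    "red gamma": ((3403, 3819), 139),  # SEQ ID NO: 14
}

for product_name, ((start, end), stated_length) in FEATURES.items():
    codons = cds_codon_count(start, end)
    status = "consistent" if codons == stated_length else "MISMATCH"
    print(f"{product_name}: {start}..{end} = {codons} codons ({status})")
```

The same check applied to the bla feature of SEQ ID NO: 8 (location 1..861) gives 287 codons, in agreement with the length stated for SEQ ID NO: 9.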



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60 

TCCGTCAAGC CGTCAATTGT CTGATTCGTT ACCAATTATG ACAACTTGAC GGCTACATCA 120 

TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180 

AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240 

GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGCCAG 300 

CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360 

CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420

TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480 

TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540 

CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600 

GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660 

TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCTCCGGA 720 

TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780 

ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTGACCGC GAATGGTGAG ATTGAGAATA 840 

TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900 

GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960

TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020

TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080 

ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140 

AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200 

CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260

TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACC 1319 

ATG ACA CCG GAC ATT ATC CTG CAG CGT ACC GGG ATC GAT GTG AGA GCT 1367
Met Thr Pro Asp Ile Ile Leu Gln Arg Thr Gly Ile Asp Val Arg Ala
290 295 300

GTC GAA CAG GGG GAT GAT GCG TGG CAC AAA TTA CGG CTC GGC GTC ATC 1415
Val Glu Gln Gly Asp Asp Ala Trp His Lys Leu Arg Leu Gly Val Ile
305 310 315

ACC GCT TCA GAA GTT CAC AAC GTG ATA GCA AAA CCC CGC TCC GGA AAG 1463
Thr Ala Ser Glu Val His Asn Val Ile Ala Lys Pro Arg Ser Gly Lys
320 325 330 335

AAG TGG CCT GAC ATG AAA ATG TCC TAC TTC CAC ACC CTG CTT GCT GAG 1511
Lys Trp Pro Asp Met Lys Met Ser Tyr Phe His Thr Leu Leu Ala Glu
340 345 350

GTT TGC ACC GGT GTG GCT CCG GAA GTT AAC GCT AAA GCA CTG GCC TGG 1559
Val Cys Thr Gly Val Ala Pro Glu Val Asn Ala Lys Ala Leu Ala Trp
355 360 365

GGA AAA CAG TAC GAG AAC GAC GCC AGA ACC CTG TTT GAA TTC ACT TCC 1607
Gly Lys Gln Tyr Glu Asn Asp Ala Arg Thr Leu Phe Glu Phe Thr Ser
370 375 380

GGC GTG AAT GTT ACT GAA TCC CCG ATC ATC TAT CGC GAC GAA AGT ATG 1655
Gly Val Asn Val Thr Glu Ser Pro Ile Ile Tyr Arg Asp Glu Ser Met
385 390 395

CGT ACC GCC TGC TCT CCC GAT GGT TTA TGC AGT GAC GGC AAC GGC CTT 1703
Arg Thr Ala Cys Ser Pro Asp Gly Leu Cys Ser Asp Gly Asn Gly Leu
400 405 410 415

GAA CTG AAA TGC CCG TTT ACC TCC CGG GAT TTC ATG AAG TTC CGG CTC 1751
Glu Leu Lys Cys Pro Phe Thr Ser Arg Asp Phe Met Lys Phe Arg Leu
420 425 430

GGT GGT TTC GAG GCC ATA AAG TCA GCT TAC ATG GCC CAG GTG CAG TAC 1799
Gly Gly Phe Glu Ala Ile Lys Ser Ala Tyr Met Ala Gln Val Gln Tyr
435 440 445

AGC ATG TGG GTG ACG CGA AAA AAT GCC TGG TAC TTT GCC AAC TAT GAC 1847
Ser Met Trp Val Thr Arg Lys Asn Ala Trp Tyr Phe Ala Asn Tyr Asp
450 455 460

CCG CGT ATG AAG CGT GAA GGC CTG CAT TAT GTC GTG ATT GAG CGG GAT 1895
Pro Arg Met Lys Arg Glu Gly Leu His Tyr Val Val Ile Glu Arg Asp
465 470 475

GAA AAG TAC ATG GCG AGT TTT GAC GAG ATC GTG CCG GAG TTC ATC GAA 1943
Glu Lys Tyr Met Ala Ser Phe Asp Glu Ile Val Pro Glu Phe Ile Glu
480 485 490 495

AAA ATG GAC GAG GCA CTG GCT GAA ATT GGT TTT GTA TTT GGG GAG CAA 1991
Lys Met Asp Glu Ala Leu Ala Glu Ile Gly Phe Val Phe Gly Glu Gln
500 505 510

TGG CGA TAG ATCCGGTACC CGAGCACGTG TTGACAATTA ATCATCGGCA 2040
Trp Arg *

TAGTATATCG GCATAGTATA ATACGACAAG GTGAGGAACT AAACC ATG AGT ACT 2094
Met Ser Thr
1

GCA CTC GCA ACG CTG GCT GGG AAG CTG GCT GAA CGT GTC GGC ATG GAT 2142
Ala Leu Ala Thr Leu Ala Gly Lys Leu Ala Glu Arg Val Gly Met Asp
5 10 15

TCT GTC GAC CCA CAG GAA CTG ATC ACC ACT CTT CGC CAG ACG GCA TTT 2190
Ser Val Asp Pro Gln Glu Leu Ile Thr Thr Leu Arg Gln Thr Ala Phe
20 25 30 35

AAA GGT GAT GCC AGC GAT GCG CAG TTC ATC GCA TTA CTG ATC GTT GCC 2238
Lys Gly Asp Ala Ser Asp Ala Gln Phe Ile Ala Leu Leu Ile Val Ala
40 45 50

AAC CAG TAC GGC CTT AAT CCG TGG ACG AAA GAA ATT TAC GCC TTT CCT 2286
Asn Gln Tyr Gly Leu Asn Pro Trp Thr Lys Glu Ile Tyr Ala Phe Pro
55 60 65

GAT AAG CAG AAT GGC ATC GTT CCG GTG GTG GGC GTT GAT GGC TGG TCC 2334
Asp Lys Gln Asn Gly Ile Val Pro Val Val Gly Val Asp Gly Trp Ser
70 75 80

CGC ATC ATC AAT GAA AAC CAG CAG TTT GAT GGC ATG GAC TTT GAG CAG 2382
Arg Ile Ile Asn Glu Asn Gln Gln Phe Asp Gly Met Asp Phe Glu Gln
85 90 95

GAC AAT GAA TCC TGT ACA TGC CGG ATT TAC CGC AAG GAC CGT AAT CAT 2430
Asp Asn Glu Ser Cys Thr Cys Arg Ile Tyr Arg Lys Asp Arg Asn His
100 105 110 115

CCG ATC TGC GTT ACC GAA TGG ATG GAT GAA TGC CGC CGC GAA CCA TTC 2478
Pro Ile Cys Val Thr Glu Trp Met Asp Glu Cys Arg Arg Glu Pro Phe
120 125 130

AAA ACT CGC GAA GGC AGA GAA ATC ACG GGG CCG TGG CAG TCG CAT CCC 2526
Lys Thr Arg Glu Gly Arg Glu Ile Thr Gly Pro Trp Gln Ser His Pro
135 140 145

AAA CGG ATG TTA CGT CAT AAA GCC ATG ATT CAG TGT GCC CGT CTG GCC 2574
Lys Arg Met Leu Arg His Lys Ala Met Ile Gln Cys Ala Arg Leu Ala
150 155 160

TTC GGA TTT GCT GGT ATC TAT GAC AAG GAT GAA GCC GAG CGC ATT GTC 2622
Phe Gly Phe Ala Gly Ile Tyr Asp Lys Asp Glu Ala Glu Arg Ile Val
165 170 175

GAA AAT ACT GCA TAC ACT GCA GAA CGT CAG CCG GAA CGC GAC ATC ACT 2670
Glu Asn Thr Ala Tyr Thr Ala Glu Arg Gln Pro Glu Arg Asp Ile Thr
180 185 190 195

CCG GTT AAC GAT GAA ACC ATG CAG GAG ATT AAC ACT CTG CTG ATC GCC 2718
Pro Val Asn Asp Glu Thr Met Gln Glu Ile Asn Thr Leu Leu Ile Ala
200 205 210

CTG GAT AAA ACA TGG GAT GAC GAC TTA TTG CCG CTC TGT TCC CAG ATA 2766
Leu Asp Lys Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys Ser Gln Ile
215 220 225

TTT CGC CGC GAC ATT CGT GCA TCG TCA GAA CTG ACA CAG GCC GAA GCA 2814
Phe Arg Arg Asp Ile Arg Ala Ser Ser Glu Leu Thr Gln Ala Glu Ala
230 235 240

GTA AAA GCT CTT GGA TTC CTG AAA CAG AAA GCC GCA GAG CAG AAG GTG 2862
Val Lys Ala Leu Gly Phe Leu Lys Gln Lys Ala Ala Glu Gln Lys Val
245 250 255

GCA GCA TAG ATCTCGAGAA GCTTCCTGCT GAACATCAAA GGCAAGAAAA 2911
Ala Ala *
260

CATCTGTTGT CAAAGACAGC ATCCTTGAAC AAGGACAATT AACAGTTAAC AAATAAAAAC 2971 

GCAAAAGAAA ATGCCGATAT CCTATTGGCA TTTTCTTTTA TTTCTTATCA ACATAAAGGT 3031 

GAATCCCATA CCTCGAGCTT CACGCTGCCG CAAGCACTCA GGGCGCAAGG GCTGCTAAAA 3091 

GGAAGCGGAA CACGTAGAAA GCCAGTCCGC AGAAACGGTG CTGACCCCGG ATGAATGTCA 3151 

GCTACTGGGC TATCTGGACA AGGGAAAACG CAAGCGCAAA GAGAAAGCAG GTAGCTTGCA 3211 

GTGGGCTTAC ATGGCGATAG CTAGACTGGG CGGTTTTATG GACAGCAAGC GAACCGGAAT 3271 

TGCCAGCTGG GGCGCCCTCT GGTAAGGTTG GGAAGCCCTG CAAAGTAAAC TGGATGGCTT 3331 

TCTTGCCGCC AAGGATCTGA TGGCGCAGGG GATCAAGATC TGATCAAGAG ACAGGATGAG 3391 

GATCGTTTCG C ATG GAT ATT AAT ACT GAA ACT GAG ATC AAG CAA AAG CAT 3441
Met Asp Ile Asn Thr Glu Thr Glu Ile Lys Gln Lys His
1 5 10

TCA CTA ACC CCC TTT CCT GTT TTC CTA ATC AGC CCG GCA TTT CGC GGG 3489
Ser Leu Thr Pro Phe Pro Val Phe Leu Ile Ser Pro Ala Phe Arg Gly
15 20 25

CGA TAT TTT CAC AGC TAT TTC AGG AGT TCA GCC ATG AAC GCT TAT TAC 3537
Arg Tyr Phe His Ser Tyr Phe Arg Ser Ser Ala Met Asn Ala Tyr Tyr
30 35 40 45

ATT CAG GAT CGT CTT GAG GCT CAG AGC TGG GCG CGT CAC TAC CAG CAG 3585
Ile Gln Asp Arg Leu Glu Ala Gln Ser Trp Ala Arg His Tyr Gln Gln
50 55 60

CTC GCC CGT GAA GAG AAA GAG GCA GAA CTG GCA GAC GAC ATG GAA AAA 3633
Leu Ala Arg Glu Glu Lys Glu Ala Glu Leu Ala Asp Asp Met Glu Lys
65 70 75

GGC CTG CCC CAG CAC CTG TTT GAA TCG CTA TGC ATC GAT CAT TTG CAA 3681
Gly Leu Pro Gln His Leu Phe Glu Ser Leu Cys Ile Asp His Leu Gln
80 85 90

CGC CAC GGG GCC AGC AAA AAA TCC ATT ACC CGT GCG TTT GAT GAC GAT 3729
Arg His Gly Ala Ser Lys Lys Ser Ile Thr Arg Ala Phe Asp Asp Asp
95 100 105

GTT GAG TTT CAG GAG CGC ATG GCA GAA CAC ATC CGG TAC ATG GTT GAA 3777
Val Glu Phe Gln Glu Arg Met Ala Glu His Ile Arg Tyr Met Val Glu
110 115 120 125

ACC ATT GCT CAC CAC CAG GTT GAT ATT GAT TCA GAG GTA TAA 3819
Thr Ile Ala His His Gln Val Asp Ile Asp Ser Glu Val *
130 135

AACGAGTAGA AGCTTGGCTG TTTTGGCGGA TGAGAGAAGA TTTTCAGCCT GATACAGATT 3879 

AAATCAGAAC GCAGAAGCGG TCTGATAAAA CAGAATTTGC CTGGCGGCAG TAGCGCGGTG 3939 

GTCCCACCTG ACCCCATGCC GAACTCAGAA GTGAAACGCC GTAGCGCCGA TGGTAGTGTG 3999 

GGGTCTCCCC ATGCGAGAGT AGGGAACTGC CAGGCATCAA ATAAAACGAA AGGCTCAGTC 4059 






GAAAGACTGG GCCTTTCGTT TTATCTGTTG TTTGTCGGTG AACGCTCTCC TGAGTAGGAC 4119

AAATCCGCCG GGAGCGGATT TGAACGTTGC GAAGCAACGG CCCGGAGGGT GGCGGGCAGG 4179

ACGCCCGCCA TAAACTGCCA GGCATCAAAT TAAGCAGAAG GCCATCCTGA CGGATGGCCT 4239

TTTTGCGTTT CTACAAACTC TTTTGTTTAT TTTTCTAAAT ACATTCAAAT ATGTATCCGC 4299

TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA 4359

TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG 4419

CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG 4479

GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC 4539

GTTTTCCAAT GATGAGCACT TTTAAAGTTC TGCTATGTGG CGCGGTATTA TCCCGTGTTG 4599

ACGCCGGGCA AGAGCAACTC GGTCGCCGCA TACACTATTC TCAGAATGAC TTGGTTGAGT 4659

ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA TTATGCAGTG 4719

CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACTACG ATCGGAGGAC 4779

CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT 4839

GGGAACCGGA GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG 4899

CAATGGCAAC AACGTTGCGC AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC 4959

AACAATTAAT AGACTGGATG GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC 5019

TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG TGAGCGTGGG TCTCGCGGTA 5079

TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT CGTAGTTATC TACACGACGG 5139

GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT GCCTCACTGA 5199

TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTACGCG 5259

CCCTGTAGCG GCGCATTAAG CGCGGCGGGT GTGGTGGTTA CGCGCAGCGT GACCGCTACA 5319

CTTGCCAGCG CCCTAGCGCC CGCTCCTTTC GCTTTCTTCC CTTCCTTTCT CGCCACGTTC 5379

GCCGGCTTTC CCCGTCAAGC TCTAAATCGG GGGCTCCCTT TAGGGTTCCG ATTTAGTGCT 5439

TTACGGCACC TCGACCCCAA AAAACTTGAT TTGGGTGATG GTTCACGTAG TGGGCCATCG 5499

CCCTGATAGA CGGTTTTTCG CCCTTTGACG TTGGAGTCCA CGTTCTTTAA TAGTGGACTC 5559

TTGTTCCAAA CTTGAACAAC ACTCAACCCT ATCTCGGGCT ATTCTTTTGA TTTATAAGGG 5619

ATTTTGCCGA TTTCGGCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG 5679

AATTTTAACA AAATATTAAC GTTTACAATT TAAAAGGATC TAGGTGAAGA TCCTTTTTGA 5739

TAATCTCATG ACCAAAATCC CTTAACGTGA GTTTTCGTTC CACTGAGCGT CAGACCCCGT 5799

AGAAAAGATC AAAGGATCTT CTTGAGATCC TTTTTTTCTG CGCGTAATCT GCTGCTTGCA 5859

AACAAAAAAA CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT 5919

TTTTCCGAAG GTAACTGGCT TCAGCAGAGC GCAGATACCA AATACTGTCC TTCTAGTGTA 5979

GCCGTAGTTA GGCCACCACT TCAAGAACTC TGTAGCACCG CCTACATACC TCGCTCTGCT 6039

AATCCTGTTA CCAGTGGCTG CTGCCAGTGG CGATAAGTCG TGTCTTACCG GGTTGGACTC 6099

AAGACGATAG TTACCGGATA AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT CGTGCACACA 6159

GCCCAGCTTG GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG AGCTATGAGA 6219 

AAGCGCCACG CTTCCCGAAG GGAGAAAGGC GGACAGGTAT CCGGTAAGCG GCAGGGTCGG 6279 

AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC TGGTATCTTT ATAGTCCTGT 6339 

CGGGTTTCGC CACCTCTGAC TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG 6399 

CCTATGGAAA AACGCCAGCA ACGCGGCCTT TTTACGGTTC CTGGCCTTTT GCTGGCCTTT 6459 

TGCTCACATG TTCTTTCCTG CGTTATCCCC TGATTCTGTG GATAACCGTA TTACCGCCTT 6519 

TGAGTGAGCT GATACCGCTC GCCGCAGCCG AACGACCGAG CGCAGCGAGT CAGTGAGCGA 6579 

GGAAGCGGAA GAGCGCCTGA TGCGGTATTT TCTCCTTACG CATCTGTGCG GTATTTCACA 6639 

CCGCATAGGG TCATGGCTGC GCCCCGACAC CCGCCAACAC CCGCTGACGC GCCCTGACGG 6699 

GCTTGTCTGC TCCCGGCATC CGCTTACAGA CAAGCTGTGA CCGTCTCCGG GAGCTGCATG 6759 

TGTCAGAGGT TTTCACCGTC ATCACCGAAA CGCGCGAGGC AGCAAGGAGA TGGCGCCCAA 6819 

CAGTCCCCCG GCCACGGGGC CTGCCACCAT ACCCACGCCG AAACAAGCGC TCATGAGCCC 6879 

GAAGTGGCGA GCCCGATCTT CCCCATCGGT GATGTCGGCG ATATAGGCGC CAGCAACCGC 6939 

ACCTGTGGCG CCGGTGATGC CGGCCACGAT GCGTCCGGCG TAGAGGATCT GCTCATGTTT 6999

GACAGCTTAT C 7010



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 227 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Met Thr Pro Asp Ile Ile Leu Gln Arg Thr Gly Ile Asp Val Arg Ala
1 5 10 15

Val Glu Gln Gly Asp Asp Ala Trp His Lys Leu Arg Leu Gly Val Ile
20 25 30

Thr Ala Ser Glu Val His Asn Val Ile Ala Lys Pro Arg Ser Gly Lys
35 40 45

Lys Trp Pro Asp Met Lys Met Ser Tyr Phe His Thr Leu Leu Ala Glu
50 55 60

Val Cys Thr Gly Val Ala Pro Glu Val Asn Ala Lys Ala Leu Ala Trp
65 70 75 80

Gly Lys Gln Tyr Glu Asn Asp Ala Arg Thr Leu Phe Glu Phe Thr Ser
85 90 95

Gly Val Asn Val Thr Glu Ser Pro Ile Ile Tyr Arg Asp Glu Ser Met
100 105 110

Arg Thr Ala Cys Ser Pro Asp Gly Leu Cys Ser Asp Gly Asn Gly Leu
115 120 125

Glu Leu Lys Cys Pro Phe Thr Ser Arg Asp Phe Met Lys Phe Arg Leu
130 135 140

Gly Gly Phe Glu Ala Ile Lys Ser Ala Tyr Met Ala Gln Val Gln Tyr
145 150 155 160

Ser Met Trp Val Thr Arg Lys Asn Ala Trp Tyr Phe Ala Asn Tyr Asp
165 170 175

Pro Arg Met Lys Arg Glu Gly Leu His Tyr Val Val Ile Glu Arg Asp
180 185 190

Glu Lys Tyr Met Ala Ser Phe Asp Glu Ile Val Pro Glu Phe Ile Glu
195 200 205

Lys Met Asp Glu Ala Leu Ala Glu Ile Gly Phe Val Phe Gly Glu Gln
210 215 220

Trp Arg *
225



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 262 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ser Thr Ala Leu Ala Thr Leu Ala Gly Lys Leu Ala Glu Arg Val
1 5 10 15

Gly Met Asp Ser Val Asp Pro Gln Glu Leu Ile Thr Thr Leu Arg Gln
20 25 30

Thr Ala Phe Lys Gly Asp Ala Ser Asp Ala Gln Phe Ile Ala Leu Leu
35 40 45

Ile Val Ala Asn Gln Tyr Gly Leu Asn Pro Trp Thr Lys Glu Ile Tyr
50 55 60

Ala Phe Pro Asp Lys Gln Asn Gly Ile Val Pro Val Val Gly Val Asp
65 70 75 80

Gly Trp Ser Arg Ile Ile Asn Glu Asn Gln Gln Phe Asp Gly Met Asp
85 90 95

Phe Glu Gln Asp Asn Glu Ser Cys Thr Cys Arg Ile Tyr Arg Lys Asp
100 105 110

Arg Asn His Pro Ile Cys Val Thr Glu Trp Met Asp Glu Cys Arg Arg
115 120 125

Glu Pro Phe Lys Thr Arg Glu Gly Arg Glu Ile Thr Gly Pro Trp Gln
130 135 140

Ser His Pro Lys Arg Met Leu Arg His Lys Ala Met Ile Gln Cys Ala
145 150 155 160

Arg Leu Ala Phe Gly Phe Ala Gly Ile Tyr Asp Lys Asp Glu Ala Glu
165 170 175

Arg Ile Val Glu Asn Thr Ala Tyr Thr Ala Glu Arg Gln Pro Glu Arg
180 185 190

Asp Ile Thr Pro Val Asn Asp Glu Thr Met Gln Glu Ile Asn Thr Leu
195 200 205

Leu Ile Ala Leu Asp Lys Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys
210 215 220

Ser Gln Ile Phe Arg Arg Asp Ile Arg Ala Ser Ser Glu Leu Thr Gln
225 230 235 240

Ala Glu Ala Val Lys Ala Leu Gly Phe Leu Lys Gln Lys Ala Ala Glu
245 250 255

Gln Lys Val Ala Ala *
260

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 139 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asp Ile Asn Thr Glu Thr Glu Ile Lys Gln Lys His Ser Leu Thr
1 5 10 15

Pro Phe Pro Val Phe Leu Ile Ser Pro Ala Phe Arg Gly Arg Tyr Phe
20 25 30

His Ser Tyr Phe Arg Ser Ser Ala Met Asn Ala Tyr Tyr Ile Gln Asp
35 40 45

Arg Leu Glu Ala Gln Ser Trp Ala Arg His Tyr Gln Gln Leu Ala Arg
50 55 60

Glu Glu Lys Glu Ala Glu Leu Ala Asp Asp Met Glu Lys Gly Leu Pro
65 70 75 80

Gln His Leu Phe Glu Ser Leu Cys Ile Asp His Leu Gln Arg His Gly
85 90 95

Ala Ser Lys Lys Ser Ile Thr Arg Ala Phe Asp Asp Asp Val Glu Phe
100 105 110

Gln Glu Arg Met Ala Glu His Ile Arg Tyr Met Val Glu Thr Ile Ala
115 120 125

His His Gln Val Asp Ile Asp Ser Glu Val *
130 135 

Table 1: Sequences of Oligos for PCR
Figure 3ab 

left: TGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA 
right: TACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA 
template: pJP5603 

targeting vector: pSV-paZ11
Figure 3c 

a-left: CTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

a-right: ATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA 

b-left: AGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA 

b-right: GCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA 

c-left: CACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA 

c-right: TGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA 

d-left: TGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

d-right: TACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA
e-left:

CACGCCCCTGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA 
e-right: 

TAAAACCTCTACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA
f-left: 

TCCCCTGACCCACGCCCCTGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGT 
AACGCACTGA 

f-right: 

TAAAGCAAGTAAAACCTCTACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCC 
CGCTTTCCA 

template: pJP5603 

targeting vector: pSV-paZ11

Figure 3d 
a-left: 

TCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATCAAGGGCTGCTAAAGGAA 
a-right: 

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCA 
b-left: 

CACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGCAAGGGCTGCTAAAGGAA 
b-right: 

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT 
c-left: 

TTAACCGTCACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCACAAGGGCTGCTAAAGGAA 
c-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT 
d-left: 

TGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGACAAGGGCTGCTAAAGGAA 
d-right: 

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT 
e-left: 

TCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTCAAGGGCTGCTAAAGGAA 
e-right: 

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAG 
f-left: 

TGGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGCAAGGGCTGCTAAAGGAA 
f-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
g-left: 

TGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCAAGGGCTGCTAAAGGAA 
g-right: 

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT 
h-left: 

TGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCAAGGGCTGCTAAA 
h-right: 

TATTTTTGACACCAGACCAACTGGTAATGGTAGCGACCGGCGCTCAGCTGGCGAAGAACTCCAGCAT 
template: pJP5603 

targeting vector: pSV-paZ11

Figure 4 
left: 

TCAAGTGAGAAATCACCATGAGTGACGACTGAATCCGGTGAGAATGGCAAAAGCTTATGCCCACCAGC 

TGGTATGGCTGATTATGATC 

right: 

TCCAACATGGATGCTGATTTATATGGGTATAAATGGGCTCGCGATAATGTCGGGCAATCAGGTGCGACA 
ATCTACCACCAGCTCTTTTCTACGGGGTCTGACGC 
template: pBR322 
targeting vector: Hoxa-P1

Figure 5
left: 

TGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTTAATACGACTCACTATAGGGAGAACA 

GGAAACAGCTATGCCCATAACACCCAGAGTA 

right: 

TGCGCCGCTACAGGGCGCGTCCATTCGCCATTCAGGCCTGACTCACTAGTGATGGTGATGGTGATGTGG 
GGGGTGCCGCTCAGT 

template: pmtrx (a pBluescript vector carrying mouse trithorax cDNA)
targeting vector: pZero2.1

Figure 6 
left: 

TGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGGAGAAAAAAATCACT 

GGATATACCACCG 

right: 

TACAGGGCGCGTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACGCCCCGCCCTGC 
CACTCATCGCA 

template: pMAK705 

targeting vector: pBAD-24 backbone Amp resistant gene 

Figure 8 

i: 

TGCCAAGCTTGACCCACTGTGGAAGTGTTCCAAAAAGCGGGAAGGCTCTTGAGCTACTTCACTAACAAC 
CGG 

g: 

TCACCATCTTCGGGCCATTTGTAGACTGGAATATTTCGAGCTATGAGTGTGCTACTO 
G 

h: 

TGGCCCCAGGGTGACGCGGACATGGAGTTGTCGCCAGGGCACTGGTCCATGAGAGTGCCAAGCTACTC 
GCGAC 

template: pKaZ 

targeting vector: Hoxa-P1

Figure 9
j:

TAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTCCGCAGCCTGAATGGCGAATGGCGC^

TTGCCTGGTTTATAACTTCGTATAGCATACATTATACGAAGTTATGGGCTGCTAAAGGAAGCGGAACAC
G 

k: 

TGGCAGTTCAGGCCAATCCGCGCCGGATGCGGTGTATCGCTCGCCACTTCAACATCAACGGTAATCGCC 

ATTTGACCATATAACTTCGTATAATGTATGCTATACGAAGTTATCCCCAGAGTCCCGCTCAGAAGAACT
template: pJP5603 

targeting vector: JC9604 chromosome

Figure 10 
l:

TAGCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACCCA^^ 

ATATACCTGCCGTTCACTAT 

m: 

TATCGGTGGCCGTGGTGTCGGCTCCGCCGCCTTCATACTGCACCGGGCGGGAAGGCGATTCCGAAGCCC
AACCTTTCATAGAAGCC 

template: pIB279 

targeting vector: pSV-paZ11

l*: GCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAA
m*: 

TCGGTGGCCGTGGTGTCGGCTCCGCCGCCTTCATACTGCACCGGGCGGGAAGGATCCACAGATTTGATC 
CAGCGATACAGC 

template: pSV-paZ11

targeting vector: pSV-sacB-neo

Figure 11
n: 

TACCGCATTAAAGCTTATCGATGATAAGCTGTCAAACATGAGAATTGACCCGGAACCCTTCTCGAGGAA 
GTTCCTATTCTCTAGAAAGTATAGGAACTTCCGAATAAATACCTGTGACGGAAGATCACTT 
p:

TTCCCTCAAGAATTTTACTCTGTCAGAAACGGCCTTAACGACGTAGTCGAGGGACCTAGAAGTTCCTAT 

ACTTTCTAGAGAATAGGAACTTCATTATCACTTATTCAGGCGTAGCACCAGGCG 

template: pMAK705 

targeting vector: Hoxa-P1

Figure 12 
left: 

TGAGACAATAACCCTGATAAATGCTTCAATAATATTCAAAAAGGAAGAGTATGGAGAAAAAAATCACT 

GGATATACCACCG 

right: 

TACAGGGCGCGTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACGCCCCGCCCTGC 
CACTCATCGCA 

template: pMAK705 

targeting vector: pBAD-24 backbone Amp resistant gene 

particular, it relies on the use of the E. coli
RecE and RecT proteins, the bacteriophage
Red-alpha and Red-beta proteins, or the
phage P22 recombination system. The
beneficial effect of concomitant
expression of the RecBC inhibitor genes
(e.g. Red-Gamma) is also exemplified.

AMENDED CLAIMS



[received by the International Bureau on 04 August 1999 (04.08.99); 
original claims 1-50 replaced by new claims 1-64 (10 pages)]

1. A method for cloning DNA molecules in procaryotic cells comprising
the steps of: 

a) providing a procaryotic host cell capable of performing 
homologous recombination, 

b) contacting in said host cell a circular first DNA molecule
which is capable of being replicated in said host cell with a 
second DNA molecule comprising at least two regions of 
sequence homology to regions on the first DNA molecule, 
under conditions which favour homologous recombination 
between said first and second DNA molecules and 

c) selecting a host cell in which homologous recombination 
between said first and second DNA molecules has occurred. 

2. The method according to claim 1 wherein the homologous 
recombination occurs via the recET cloning mechanism. 

3. The method according to claim 2 wherein the host cell is capable of 
expressing recE and recT genes. 



4. The method according to claim 3 wherein the recE and recT genes
are selected from E.coli recE and recT genes or from λ redα and redβ
genes.

5. The method according to claim 3 or 4 wherein the host cell is 
transformed with at least one vector capable of expressing recE 
and/or recT genes. 

6. The method of claim 3, 4 or 5 wherein the expression of the recE 
and/or recT genes is under control of a regulatable promoter. 

7. The method of claim 5 or 6 wherein the recT gene is overexpressed
versus the recE gene. 

8. The method according to any one of claims 3 to 7 wherein the recE
gene is selected from a nucleic acid molecule comprising 

(a) the nucleic acid sequence from position 1320 (ATG) to 2159 
(GAC) as depicted in Fig.7B, 

(b) the nucleic acid sequence from position 1320 (ATG) to 1998
(CGA) as depicted in Fig.13B, 

(c) a nucleic acid encoding the same polypeptide within the 
degeneracy of the genetic code and/or 

(d) a nucleic acid sequence which hybridizes under stringent 
conditions with the nucleic acid sequence from (a), (b) and/or (c). 

9. The method according to any one of claims 3 to 8 wherein the recT
gene is selected from a nucleic acid molecule comprising 

(a) the nucleic acid sequence from position 2155 (ATG) to 2961 
(GAA) as depicted in Fig.7B, 

(b) the nucleic acid sequence from position 2086 (ATG) to 2868 
(GCA) as depicted in Fig.13B,

(c) a nucleic acid encoding the same polypeptide within the 
degeneracy of the genetic code and/or 

(d) a nucleic acid sequence which hybridizes under stringent 
conditions with the nucleic acid sequences from (a), (b) and/or (c). 

10. The method according to any one of the previous claims wherein the
host cell is a gram-negative bacterial cell. 

11. The method according to claim 10 wherein the host cell is an
Escherichia coli cell. 

12. The method according to claim 11 wherein the host cell is an
Escherichia coli K12 strain. 

13. The method according to claim 12 wherein the E.coli strain is 
selected from JC 8679 and JC 9604. 

14. The method according to any one of the previous claims wherein the
host cell further is capable of expressing a recBC inhibitor gene. 

15. The method according to claim 14 wherein the host cell is 
transformed with a vector expressing the recBC inhibitor gene. 

16. The method according to claim 14 or 15 wherein the recBC inhibitor
gene is selected from a nucleic acid molecule comprising 

(a) the nucleic acid sequence from position 3588 (ATG) to 4002 
(GTA) as depicted in Fig.13B, 

(b) a nucleic acid encoding the same polypeptide within the 
degeneracy of the genetic code and/or 

(c) a nucleic acid sequence which hybridizes under stringent 
conditions (as defined above) with the nucleic acid sequence from (a) 
and/or (b).

17. The method according to any one of claims 13 to 16 wherein the
host cell is a prokaryotic recBC+ cell.

18. The method according to any one of the previous claims wherein the
first DNA molecule is an extrachromosomal DNA molecule containing
an origin of replication which is operative in the host cell. 

19. The method according to claim 18 wherein the first DNA molecule is
selected from plasmids, cosmids, P1 vectors, BAC vectors and PAC
vectors. 

20. The method according to any one of claims 1-18 wherein the first
DNA molecule is a host cell chromosome. 

21. The method according to any one of the previous claims wherein the
second DNA molecule is linear. 

22. The method according to any one of the previous claims wherein the 
regions of sequence homology are at least 15 nucleotides each.

23. The method according to one of claims 1 to 16 wherein the second
DNA molecule is obtained by an amplification reaction.

24. The method according to one of the previous claims wherein the first 
and/or second DNA molecules are introduced into the host cells by
transformation. 

25. The method according to claim 24 wherein the transformation 
method is electroporation. 

26. The method according to one of claims 1 to 25 wherein the first and 
second DNA molecules are introduced into the host cell 
simultaneously by co-transformation. 

27. The method according to one of claims 1 to 25 wherein the second 
DNA molecule is introduced into a host cell in which the first DNA 
molecule is already present. 

28. The method according to one of the previous claims wherein the 
second DNA molecule contains at least one marker gene placed 
between the two regions of sequence homology and wherein
homologous recombination is detected by expression of said marker 
gene. 

29. The method according to claim 28 wherein the marker gene is selected
from antibiotic resistance genes, deficiency complementation genes 
and reporter genes. 

30. The method of any one of claims 1 to 29 wherein the first DNA 
molecule contains at least one marker gene between the two regions 
of sequence homology and wherein homologous recombination is
detected by lack of expression of said marker gene. 

31. The method of any one of claims 1 to 30 wherein said marker gene
is selected from genes which, under selected conditions, convey a
toxic or bacteriostatic effect on the cell, and reporter genes. 

32. A method according to any one of the previous claims wherein the 
first DNA molecule contains at least one target site for a site specific 
recombinase between the two regions of sequence homology and 
wherein homologous recombination is detected by removal of said 
target site. 

33. A method for cloning DNA molecules comprising the steps of: 

(a) providing a source of RecE and RecT proteins, 

(b) contacting a first DNA molecule which is capable of being 
replicated in a suitable host cell with a second DNA molecule 
comprising at least two regions of sequence homology to regions on 
the first DNA molecule, under conditions which favour homologous 
recombination between said first and second DNA molecules and

(c) selecting DNA molecules in which homologous recombination 
between said first and second DNA molecules has occurred. 

34. The method of claim 33 wherein said RecE and RecT proteins are
selected from E.coli RecE and RecT proteins or from phage λ Redα
and Redβ proteins.

35. The method of claim 33 or 34 wherein the recombination occurs in
vitro.

36. The method of claim 33 or 34 wherein the recombination occurs in 
vivo. 

37. A method for making a recombinant DNA molecule comprising 
introducing into a prokaryotic host cell a circular first DNA molecule
which is capable of being replicated in said host cell, and introducing
a second DNA molecule comprising a first and a second region of 
sequence homology to a third and fourth region, respectively, on the 
first DNA molecule, said host cell being capable of performing 
homologous recombination, such that a recombinant DNA molecule 
is made, said recombinant DNA molecule comprising the first DNA 
molecule wherein the sequences between said third and fourth 
regions have been replaced by sequences between the first and 
second regions of the second DNA molecule. 

38. The method according to claim 37 which further comprises detecting
the recombinant DNA molecule. 

39. A method for making a recombinant DNA molecule comprising 
introducing into a prokaryotic host cell, containing a chromosomal 
first DNA molecule, a second DNA molecule comprising a first and 
a second region of sequence homology to a third and a fourth region, 
respectively, on the host chromosomal first DNA molecule, said host 
cell being capable of performing homologous recombination, such 
that a recombinant DNA molecule is made, said recombinant DNA 
molecule comprising the chromosomal first DNA molecule wherein 
the sequences between said third and fourth regions have been 
replaced by sequences between the first and second regions of the 
second DNA molecule. 

40. The method according to claim 39 which further comprises detecting
the recombinant DNA molecule. 

41. The method according to any one of claims 37 to 40, wherein the
host cell is capable of expressing RecE and RecT proteins or λ exo
and β proteins.

42. A method for cloning DNA molecules comprising the steps of:

(a) contacting in vitro a first DNA molecule with a second DNA 
molecule comprising at least two regions of sequence 
homology to regions on the first DNA molecule, in the 
presence of RecE and RecT proteins and under conditions 
which favour homologous recombination between said first 
and second DNA molecules; and 

(b) selecting a DNA molecule in which homologous recombination 
between said first and second DNA molecules has occurred. 

43. A method for making a recombinant DNA molecule comprising
contacting in vitro a first DNA molecule with a second DNA molecule 
comprising a first and a second region of sequence homology to a 
third and a fourth region on the first DNA molecule, in the presence 
of RecE and RecT proteins and under conditions in which 
homologous recombination can occur, such that a recombinant DNA 
molecule is made, said recombinant DNA molecule comprising the 
first DNA molecule wherein the sequences between said third and 
fourth regions have been replaced by sequences between the first 
and second regions of the second DNA molecule. 

44. The method of claim 42, which further comprises between steps (a)
and (b) the step of introducing the product of step (a) into a cell,
wherein recombination occurs in the cell. 

45. Use of cells capable of expressing the recE and recT genes as a host
cell for a cloning method involving homologous recombination. 

46. Use of a vector system capable of expressing recE and recT genes 
in a host cell for a cloning method involving homologous 
recombination. 

47. Use of claims 45 or 46 wherein the recE and recT genes are selected
from E.coli recE and recT genes or from λ redα and redβ genes.

48. Use of a source of RecE and RecT proteins for a cloning method 
involving homologous recombination. 

49. Use of claim 48 wherein said RecE and RecT proteins are selected
from E.coli RecE and RecT proteins or from phage λ Redα and Redβ
proteins.

50. A reagent kit for cloning comprising 

(a) a host cell 

(b) means of expressing recE and recT genes in said host cell and 

(c) a recipient cloning vehicle capable of being replicated in said cell. 

51. The reagent kit according to claim 50 wherein the means (b) 
comprise a vector system capable of expressing the recE and recT 
genes in the host cell. 

52. The reagent kit according to claim 50 or 51 wherein the recE and
recT genes are selected from E.coli recE and recT genes or from λ
redα and redβ genes.

53. A reagent kit for cloning comprising 

(a) a source for RecE and RecT proteins and

(b) a recipient cloning vehicle capable of being propagated in a host
cell. 

54. The reagent kit according to claim 53 further comprising a host cell 
suitable for propagating said recipient cloning vehicle. 

55. The reagent kit according to claim 53 or 54 wherein said RecE and
RecT proteins are selected from E.coli RecE and RecT proteins or
from phage λ Redα and Redβ proteins.

56. The reagent kit according to any one of claims 50-55 further 
comprising means for expressing a site specific recombinase in said 
host cell. 

57. The reagent kit according to any one of claims 50-56 further 
comprising nucleic acid amplification primers comprising a region of 
homology to said recipient cloning vehicle. 

58. A reagent kit for cloning comprising first and second DNA 
amplification primers and a recipient cloning vehicle that is a circular 
DNA molecule, said first DNA amplification primer having a first 
region of sequence homology to a third region on the circular 
recipient cloning vehicle, and said second DNA amplification primer 
having a second region of sequence homology to a fourth region on 
the circular recipient cloning vehicle. 

59. The reagent kit of claim 58, further comprising a prokaryotic host cell 
that is capable of performing homologous recombination.

60. The reagent kit of claim 58 or 59, further comprising a means of 
expressing RecE and RecT proteins or Redα and Redβ proteins.

61. The reagent kit according to any one of claims 58-60, wherein the
means comprises a vector system capable of expressing the recE and 
recT genes in the host cell. 

62. The reagent kit according to any one of claims 58-61, further 
comprising a phenotypic marker located in the recipient cloning 
vehicle between the third and fourth regions of sequence homology. 

63. The reagent kit according to any one of claims 58-62, wherein the 
recipient cloning vehicle further comprises a recognition site for a 
site-specific recombinase on the recipient cloning vehicle between
the third and fourth regions of sequence homology. 

64. The reagent kit of claim 63, further comprising means for expressing 
a site-specific recombinase in said host cell. 
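
The replacement recited in claims 37 and 39 (the sequences between the third and fourth regions of the first DNA molecule are replaced by the sequences between the first and second regions of the second DNA molecule) can be illustrated with a toy string model. The following is a minimal sketch only, not part of the claimed subject matter. The function name `recombine`, the parameter `arm_len` and the example sequences are hypothetical and introduced for illustration. The sketch assumes exact-match homology arms of equal length at the two ends of a linear second molecule, and a circular first molecule written as a plain string. It says nothing about recombination efficiency or the RecE/RecT mechanism.

```python
def recombine(first: str, second: str, arm_len: int = 15) -> str:
    """Toy model of the replacement described in claims 37 and 39.

    `first` is a circular DNA molecule written as a string; `second` is a
    linear molecule whose first and last `arm_len` bases are the homology
    arms (hypothetical simplification: exact matches only). The default of
    15 mirrors the minimum arm length recited in claim 22.
    Returns the recombinant circle, written starting at the left arm.
    """
    if arm_len <= 0 or len(second) < 2 * arm_len:
        raise ValueError("second molecule is too short for two homology arms")
    left_arm, right_arm = second[:arm_len], second[-arm_len:]

    # Doubling the string lets a match span the arbitrary start/end point
    # chosen when the circular molecule was written down.
    doubled = first + first
    start = doubled.find(left_arm)
    if start == -1 or start >= len(first):
        raise ValueError("left homology arm not found on first molecule")

    # Rotate the circle so that it begins at the left arm (the third region).
    rotated = doubled[start:start + len(first)]

    # The right arm (the fourth region) must lie downstream of the left arm.
    right = rotated.find(right_arm, arm_len)
    if right == -1:
        raise ValueError("right homology arm not found on first molecule")

    # Keep the arms, swap everything between them for the cassette carried
    # by the second molecule, and keep the rest of the first molecule.
    return second + rotated[right + arm_len:]


# Hypothetical example with 5-base arms (shorter than claim 22 requires,
# chosen only to keep the strings readable).
vector = "GG" + "ACGTA" + "CCCCCC" + "TTGCA" + "AA"
cassette = "ACGTA" + "GATTACA" + "TTGCA"
product = recombine(vector, cassette, arm_len=5)
```

In this example the six-base stretch between the arms on `vector` is replaced by the seven-base stretch carried by `cassette`, while the flanking bases of `vector` are retained.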